Paper Title

Structure-Consistent Weakly Supervised Salient Object Detection with Local Saliency Coherence

Paper Authors

Siyue Yu, Bingfeng Zhang, Jimin Xiao, Eng Gee Lim

Paper Abstract

Sparse labels have been attracting much attention in recent years. However, the performance gap between weakly supervised and fully supervised salient object detection methods is huge, and most previous weakly supervised works adopt complex training methods with many bells and whistles. In this work, we propose a one-round end-to-end training approach for weakly supervised salient object detection via scribble annotations, without pre/post-processing operations or extra supervision data. Since scribble labels fail to offer detailed salient regions, we propose a local coherence loss to propagate the labels to unlabeled regions based on image features and pixel distance, so as to predict integral salient regions with complete object structures. We design a saliency structure consistency loss as a self-consistency mechanism to ensure that consistent saliency maps are predicted when different scales of the same image are used as input, which can be viewed as a regularization technique to enhance the model's generalization ability. Additionally, we design an aggregation module (AGGM) to better integrate high-level features, low-level features, and global context information for the decoder to aggregate various information. Extensive experiments show that our method achieves a new state-of-the-art performance on six benchmarks (e.g., for the ECSSD dataset: F_β = 0.8995, E_ξ = 0.9079, and MAE = 0.0489), with an average gain of 4.60% for F-measure, 2.05% for E-measure, and 1.88% for MAE over the previous best method on this task. Source code is available at http://github.com/siyueyu/SCWSSOD.
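The local coherence loss propagates scribble supervision by encouraging nearby pixels with similar appearance to receive similar saliency scores. The following is a minimal NumPy sketch of such a bilateral pairwise term; the function name, kernel form, and parameter values are illustrative assumptions, not the paper's exact formulation:

```python
import numpy as np

def local_coherence_loss(pred, image, radius=2, sigma_rgb=0.1, sigma_xy=3.0):
    """Penalize saliency differences between nearby, similar-colored pixels.

    pred:  (H, W) saliency map with values in [0, 1]
    image: (H, W, 3) RGB image with values in [0, 1]
    Illustrative sketch of a bilateral pairwise term, not the paper's exact loss.
    """
    H, W = pred.shape
    total, count = 0.0, 0
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            if dy == 0 and dx == 0:
                continue
            # Overlapping windows so p[i] and q[i] are (dy, dx)-shifted neighbors.
            y0, y1 = max(dy, 0), H + min(dy, 0)
            x0, x1 = max(dx, 0), W + min(dx, 0)
            p = pred[y0:y1, x0:x1]
            q = pred[y0 - dy:y1 - dy, x0 - dx:x1 - dx]
            ci = image[y0:y1, x0:x1]
            cj = image[y0 - dy:y1 - dy, x0 - dx:x1 - dx]
            # Bilateral affinity: large for spatially close, similar-colored pairs.
            color_dist = np.sum((ci - cj) ** 2, axis=-1)
            spatial_dist = dy * dy + dx * dx
            w = np.exp(-color_dist / (2 * sigma_rgb ** 2)
                       - spatial_dist / (2 * sigma_xy ** 2))
            total += np.sum(w * np.abs(p - q))
            count += p.size
    return total / count
```

A spatially uniform prediction incurs zero loss, while a prediction that changes across a region of uniform color is penalized, which is the behavior that pushes scribble labels outward to cover whole objects.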
