Title
Single-Stage Open-world Instance Segmentation with Cross-task Consistency Regularization
Authors
Abstract
Open-World Instance Segmentation (OWIS) is an emerging research topic that aims to segment class-agnostic object instances from images. The mainstream approaches use a two-stage segmentation framework, which first locates candidate object bounding boxes and then performs instance segmentation. In this work, we instead promote a single-stage framework for OWIS. We argue that the end-to-end training process in the single-stage framework is more convenient for directly regularizing the localization of class-agnostic object pixels. Based on the single-stage instance segmentation framework, we propose a regularization model that predicts foreground pixels and uses their relation to instance segmentation to construct a cross-task consistency loss. We show that such a consistency loss can alleviate the problem of incomplete instance annotation -- a common problem in existing OWIS datasets. We also show that the proposed loss lends itself to an effective solution to semi-supervised OWIS, which can be considered an extreme case in which all object annotations are absent for some images. Our extensive experiments demonstrate that the proposed method achieves impressive results in both fully-supervised and semi-supervised settings. Compared to SOTA methods, the proposed method significantly improves the $AP_{100}$ score by 4.75\% in the UVO$\rightarrow$UVO setting and by 4.05\% in the COCO$\rightarrow$UVO setting. In the case of semi-supervised learning, our model trained with only 30\% labeled data even outperforms its fully-supervised counterpart trained with 50\% labeled data. The code will be released soon.
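The core idea of the cross-task consistency loss can be illustrated with a minimal sketch. The abstract does not give the exact formulation, so the following is an assumption-laden illustration (function name, L1 penalty, and soft-union via pixelwise max are all hypothetical choices): the union of all predicted instance masks should agree with the predicted class-agnostic foreground map, and their pixelwise disagreement is penalized.

```python
import numpy as np

def cross_task_consistency_loss(instance_probs: np.ndarray,
                                foreground_probs: np.ndarray) -> float:
    """Hypothetical sketch of a cross-task consistency loss.

    instance_probs:   (N, H, W) per-instance mask probabilities
    foreground_probs: (H, W)    foreground-pixel probabilities

    The soft union of instance masks (pixelwise max over instances)
    should agree with the foreground prediction; here we penalize
    their mean absolute (L1) disagreement. The actual paper may use
    a different penalty or aggregation.
    """
    union = instance_probs.max(axis=0)  # (H, W) soft union of all instances
    return float(np.abs(union - foreground_probs).mean())
```

Because the loss only couples the two predictions to each other, it requires no instance labels for the pixels involved, which is why it can regularize images with incomplete or entirely missing annotations in the semi-supervised setting.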