Paper Title

Importance of Self-Consistency in Active Learning for Semantic Segmentation

Paper Authors

S. Alireza Golestaneh, Kris M. Kitani

Paper Abstract

We address the task of active learning in the context of semantic segmentation and show that self-consistency can be a powerful source of self-supervision to greatly improve the performance of a data-driven model with access to only a small amount of labeled data. Self-consistency uses the simple observation that the results of semantic segmentation for a specific image should not change under transformations like horizontal flipping (i.e., the results should only be flipped). In other words, the output of a model should be consistent under equivariant transformations. The self-supervisory signal of self-consistency is particularly helpful during active learning since the model is prone to overfitting when there is only a small amount of labeled training data. In our proposed active learning framework, we iteratively extract small image patches that need to be labeled by selecting image patches that have high uncertainty (high entropy) under equivariant transformations. We enforce pixel-wise self-consistency between the outputs of the segmentation network for each image and its transformation (horizontal flip) to utilize the rich self-supervisory information and reduce the uncertainty of the network. In this way, we are able to find the image patches that the current model struggles the most to classify. By iteratively training over these difficult image patches, our experiments show that our active learning approach reaches $\sim96\%$ of the top performance of a model trained on all data, using only $12\%$ of the total data on benchmark semantic segmentation datasets (e.g., CamVid and Cityscapes).
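The abstract describes two mechanisms: a pixel-wise self-consistency term between a segmentation output and the output for the horizontally flipped input, and entropy-based scoring of image patches to decide which ones to send for labeling. Below is a minimal PyTorch sketch of both ideas, assuming a standard segmentation network that returns per-pixel class logits of shape (N, C, H, W). The function names (`self_consistency_loss`, `patch_entropy_scores`) and the MSE consistency term are illustrative choices, not the authors' exact formulation.

```python
import torch
import torch.nn.functional as F


def self_consistency_loss(logits: torch.Tensor,
                          logits_flipped: torch.Tensor) -> torch.Tensor:
    """Pixel-wise consistency between predictions for an image and its
    horizontal flip. `logits_flipped` comes from the horizontally flipped
    input, so it is flipped back before comparison. MSE between the two
    probability maps is one simple choice; the paper's loss may differ."""
    probs = F.softmax(logits, dim=1)
    # Undo the horizontal flip so the two predictions are spatially aligned.
    probs_unflipped = F.softmax(logits_flipped, dim=1).flip(dims=[3])
    return F.mse_loss(probs, probs_unflipped)


def patch_entropy_scores(logits: torch.Tensor,
                         patch_size: int = 32) -> torch.Tensor:
    """Score non-overlapping patches by their mean per-pixel predictive
    entropy; high-entropy patches are candidates to send for annotation."""
    probs = F.softmax(logits, dim=1)
    # Per-pixel entropy, shape (N, H, W); clamp avoids log(0).
    entropy = -(probs * torch.log(probs.clamp_min(1e-8))).sum(dim=1)
    # Average entropy over each patch_size x patch_size region.
    return F.avg_pool2d(entropy.unsqueeze(1), kernel_size=patch_size).squeeze(1)


# Usage sketch: `model` is any segmentation network returning (N, C, H, W) logits.
# images = ...                                # (N, 3, H, W) batch
# logits = model(images)
# logits_flipped = model(images.flip(dims=[3]))
# loss_sc = self_consistency_loss(logits, logits_flipped)
# scores = patch_entropy_scores(logits)       # pick top-k patches for labeling
```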
