Paper Title

Automatic Shortcut Removal for Self-Supervised Representation Learning

Paper Authors

Matthias Minderer, Olivier Bachem, Neil Houlsby, Michael Tschannen

Paper Abstract

In self-supervised visual representation learning, a feature extractor is trained on a "pretext task" for which labels can be generated cheaply, without human annotation. A central challenge in this approach is that the feature extractor quickly learns to exploit low-level visual features such as color aberrations or watermarks and then fails to learn useful semantic representations. Much work has gone into identifying such "shortcut" features and hand-designing schemes to reduce their effect. Here, we propose a general framework for mitigating the effect of shortcut features. Our key assumption is that those features which are the first to be exploited for solving the pretext task may also be the most vulnerable to an adversary trained to make the task harder. We show that this assumption holds across common pretext tasks and datasets by training a "lens" network to make small image changes that maximally reduce performance in the pretext task. Representations learned with the modified images outperform those learned without in all tested cases. Additionally, the modifications made by the lens reveal how the choice of pretext task and dataset affects the features learned by self-supervision.
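
To make the adversarial setup concrete, below is a minimal sketch of the alternating lens/encoder training loop, assuming a rotation-prediction pretext task as an example. The `Lens` and `Encoder` architectures, the `lambda_rec` reconstruction weight, and the optimizer settings are illustrative assumptions for this sketch, not the paper's actual configuration (the paper uses a U-Net lens against full-scale pretext models).

```python
# Sketch: adversarial "lens" training for shortcut removal, assuming a
# rotation-prediction pretext task. Architectures and hyperparameters
# (lambda_rec, learning rates) are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class Lens(nn.Module):
    """Small conv net that outputs a slightly modified image of the same shape."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 3, 3, padding=1),
        )
    def forward(self, x):
        return x + self.net(x)  # residual modification of the input image

class Encoder(nn.Module):
    """Feature extractor plus a 4-way rotation classification head."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.head = nn.Linear(128, 4)  # predict rotation in {0, 90, 180, 270}
    def forward(self, x):
        return self.head(self.features(x))

def rotate_batch(x):
    """Create all four rotations of each image with matching labels."""
    rots = [torch.rot90(x, k, dims=(2, 3)) for k in range(4)]
    labels = torch.arange(4).repeat_interleave(x.size(0))
    return torch.cat(rots), labels

lens, enc = Lens(), Encoder()
opt_lens = torch.optim.Adam(lens.parameters(), lr=1e-4)
opt_enc = torch.optim.Adam(enc.parameters(), lr=1e-4)
lambda_rec = 10.0  # assumed weight keeping lens modifications small

for step in range(100):
    x = torch.rand(8, 3, 32, 32)  # stand-in for a real image batch
    x_rot, y = rotate_batch(x)

    # Lens step: make the pretext task harder while staying close to the input.
    x_mod = lens(x_rot)
    pretext_loss = F.cross_entropy(enc(x_mod), y)
    rec_penalty = F.mse_loss(x_mod, x_rot)
    lens_loss = -pretext_loss + lambda_rec * rec_penalty
    opt_lens.zero_grad(); lens_loss.backward(); opt_lens.step()

    # Encoder step: solve the pretext task on the lens-modified images,
    # so the encoder cannot rely on features the lens can easily destroy.
    x_mod = lens(x_rot).detach()
    enc_loss = F.cross_entropy(enc(x_mod), y)
    opt_enc.zero_grad(); enc_loss.backward(); opt_enc.step()
```

The key design choice this sketch illustrates is the two-player objective: the lens minimizes a negative pretext loss plus a reconstruction penalty, so it can only afford small changes and therefore targets the features that are cheapest to exploit (the shortcuts), while the encoder is forced to solve the task from whatever semantic signal survives.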
