PatchMix Augmentation to Identify Causal Features in Few-shot Learning

Paper Title

PatchMix Augmentation to Identify Causal Features in Few-shot Learning

Authors

Xu, Chengming, Liu, Chen, Sun, Xinwei, Yang, Siqian, Wang, Yabiao, Wang, Chengjie, Fu, Yanwei

Abstract

Few-shot learning (FSL) aims to transfer knowledge learned from base categories with sufficient labelled data to novel categories with scarce known information. It is currently an important research question with great practical value in real-world applications. Although extensive efforts have been made on few-shot learning tasks, we emphasize that most existing methods do not take into account the distributional shift caused by sample selection bias in the FSL scenario. Such selection bias can induce spurious correlation between the semantic causal features, which are causally and semantically related to the class label, and the other non-causal features. Critically, the former should be invariant across changes in distribution, highly related to the classes of interest, and thus well generalizable to novel classes, while the latter are not stable under changes in distribution. To resolve this problem, we propose a novel data augmentation strategy, dubbed PatchMix, that breaks this spurious dependency by replacing the patch-level information and supervision of the query images with those of random gallery images from classes different from the query ones. We theoretically show that such an augmentation mechanism, unlike existing ones, is able to identify the causal features. To further make these features discriminative enough for classification, we propose Correlation-guided Reconstruction (CGR) and a Hardness-Aware module, which support instance discrimination and easier discrimination between similar classes. Moreover, such a framework can be adapted to the unsupervised FSL scenario.
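
The patch replacement described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function name `patch_mix`, the fixed `grid` of equally sized patches, the per-patch replacement probability `p`, and the choice to operate on raw pixels are all assumptions made for illustration. The only parts taken from the abstract are that patches of a query image are replaced with content from random gallery images of a different class, and that the supervision for those patches is replaced along with them.

```python
import numpy as np


def patch_mix(query, query_label, gallery, gallery_labels, grid=4, p=0.5, rng=None):
    """Hypothetical PatchMix-style augmentation (illustrative assumptions only).

    query:          (H, W, C) image array
    gallery:        (N, H, W, C) array of candidate donor images
    gallery_labels: length-N sequence of integer class labels
    Returns the mixed image and a (grid, grid) map of per-patch labels.
    """
    rng = np.random.default_rng() if rng is None else rng
    gallery_labels = np.asarray(gallery_labels)
    h, w = query.shape[:2]
    if h % grid or w % grid:
        raise ValueError("image height and width must be divisible by grid")
    ph, pw = h // grid, w // grid

    # Donors are restricted to gallery images whose class differs from the query's.
    donors = np.flatnonzero(gallery_labels != query_label)
    if donors.size == 0:
        raise ValueError("gallery contains no image from a different class")

    mixed = query.copy()
    patch_labels = np.full((grid, grid), query_label, dtype=np.int64)
    for i in range(grid):
        for j in range(grid):
            if rng.random() < p:
                d = rng.choice(donors)
                rows = slice(i * ph, (i + 1) * ph)
                cols = slice(j * pw, (j + 1) * pw)
                # Replace both the patch content and its supervision.
                mixed[rows, cols] = gallery[d, rows, cols]
                patch_labels[i, j] = gallery_labels[d]
    return mixed, patch_labels
```

In this sketch, the returned `patch_labels` map would feed a patch-level classification loss, so a replaced patch is supervised with its donor's class rather than the query's class. How the paper defines the loss, the patch granularity, and whether mixing happens on pixels or on feature maps are not stated in the abstract.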
