When Does Group Invariant Learning Survive Spurious Correlations?

Paper title

When Does Group Invariant Learning Survive Spurious Correlations?

Authors

Yimeng Chen, Ruibin Xiong, Zhiming Ma, Yanyan Lan

Abstract

By inferring latent groups in the training data, recent works extend invariant learning to settings where environment annotations are unavailable. Typically, learning group invariance under a majority/minority split has been shown empirically to improve out-of-distribution generalization on many datasets. However, theoretical guarantees that these methods learn invariant mechanisms are lacking. In this paper, we reveal the insufficiency of existing group invariant learning methods in preventing classifiers from depending on spurious correlations in the training set. Specifically, we propose two criteria for judging such sufficiency. Theoretically and empirically, we show that existing methods can violate both criteria and thus fail to generalize under spurious correlation shifts. Motivated by this, we design a new group invariant learning method that constructs groups with statistical independence tests and reweights samples by group label proportion to meet the criteria. Experiments on both synthetic and real data demonstrate that the new method significantly outperforms existing group invariant learning methods in generalizing to spurious correlation shifts.
