Paper Title

Adversarial Filters of Dataset Biases

Paper Authors

Ronan Le Bras, Swabha Swayamdipta, Chandra Bhagavatula, Rowan Zellers, Matthew E. Peters, Ashish Sabharwal, Yejin Choi

Paper Abstract

Large neural models have demonstrated human-level performance on language and vision benchmarks, while their performance degrades considerably on adversarial or out-of-distribution samples. This raises the question of whether these models have learned to solve a dataset rather than the underlying task by overfitting to spurious dataset biases. We investigate one recently proposed approach, AFLite, which adversarially filters such dataset biases, as a means to mitigate the prevalent overestimation of machine performance. We provide a theoretical understanding for AFLite, by situating it in the generalized framework for optimum bias reduction. We present extensive supporting evidence that AFLite is broadly applicable for reduction of measurable dataset biases, and that models trained on the filtered datasets yield better generalization to out-of-distribution tasks. Finally, filtering results in a large drop in model performance (e.g., from 92% to 62% for SNLI), while human performance still remains high. Our work thus shows that such filtered datasets can pose new research challenges for robust generalization by serving as upgraded benchmarks.
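
For readers who want a concrete picture of what "adversarially filters such dataset biases" means in practice, below is a minimal Python sketch of the predictability-based idea behind AFLite: train an ensemble of simple linear classifiers on random partitions of precomputed feature embeddings, score each instance by how often it is classified correctly when held out, and iteratively remove the most predictable instances. The function name `aflite_filter` and the hyperparameter defaults (`n_models`, `train_frac`, `cutoff`, `threshold`) are illustrative assumptions, not the paper's exact settings.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def aflite_filter(X, y, target_size, n_models=64, train_frac=0.8,
                  cutoff=32, threshold=0.75, seed=0):
    """Sketch of predictability-based adversarial filtering.

    X: (n, d) precomputed feature embeddings; y: (n,) labels.
    Returns the indices of the instances retained after filtering.
    Hyperparameter values here are illustrative, not the paper's.
    """
    rng = np.random.default_rng(seed)
    idx = np.arange(len(y))  # indices of instances still in the dataset

    while len(idx) > target_size:
        correct = np.zeros(len(idx))  # times predicted correctly when held out
        counted = np.zeros(len(idx))  # times each instance landed in a held-out split

        for _ in range(n_models):
            # Random train/held-out partition of the remaining data.
            perm = rng.permutation(len(idx))
            n_train = int(train_frac * len(idx))
            tr, ho = perm[:n_train], perm[n_train:]

            # A deliberately weak, linear model over fixed embeddings:
            # whatever it can predict is likely a spurious surface cue.
            clf = LogisticRegression(max_iter=1000)
            clf.fit(X[idx[tr]], y[idx[tr]])
            preds = clf.predict(X[idx[ho]])

            correct[ho] += (preds == y[idx[ho]])
            counted[ho] += 1

        # Predictability score: fraction of held-out appearances
        # on which the instance was classified correctly.
        score = np.where(counted > 0, correct / np.maximum(counted, 1), 0.0)

        # Remove up to `cutoff` instances above the threshold,
        # without dropping below the target dataset size.
        k = min(cutoff, len(idx) - target_size)
        order = np.argsort(-score)
        removable = [i for i in order if score[i] > threshold][:k]
        if not removable:
            break  # nothing is predictable enough; stop early
        idx = np.delete(idx, removable)

    return idx
```

As a usage example, on a benchmark like SNLI, X might be sentence-pair embeddings from a pretrained encoder and y the entailment labels; the retained indices would define the filtered, harder benchmark.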
