Empirical or Invariant Risk Minimization? A Sample Complexity Perspective

Paper Title

Empirical or Invariant Risk Minimization? A Sample Complexity Perspective

Authors

Kartik Ahuja, Jun Wang, Amit Dhurandhar, Karthikeyan Shanmugam, Kush R. Varshney

Abstract

Recently, invariant risk minimization (IRM) was proposed as a promising solution to address out-of-distribution (OOD) generalization. However, it is unclear when IRM should be preferred over the widely employed empirical risk minimization (ERM) framework. In this work, we analyze both these frameworks from the perspective of sample complexity, thus taking a firm step towards answering this important question. We find that depending on the type of data generation mechanism, the two approaches might have very different finite sample and asymptotic behavior. For example, in the covariate shift setting we see that the two approaches not only arrive at the same asymptotic solution, but also have similar finite sample behavior with no clear winner. For other distribution shifts such as those involving confounders or anti-causal variables, however, the two approaches arrive at different asymptotic solutions where IRM is guaranteed to be close to the desired OOD solutions in the finite sample regime, while ERM is biased even asymptotically. We further investigate how different factors (the number of environments, the complexity of the model, and the IRM penalty weight) impact the sample complexity of IRM in relation to its distance from the OOD solutions.
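To make the contrast between the two objectives and the role of the penalty weight concrete, the sketch below compares a pooled ERM-style risk with a penalized IRM-style objective. It assumes the widely used IRMv1 gradient-penalty form with squared loss and a linear predictor; the abstract does not say which IRM formulation the paper analyzes, so this is an illustration and not the authors' method. The names erm_risk, irmv1_penalty, irmv1_objective, envs, phi, and penalty_weight are hypothetical.

```python
import numpy as np


def erm_risk(preds, targets):
    """Mean squared error within one environment."""
    return float(np.mean((preds - targets) ** 2))


def irmv1_penalty(preds, targets):
    """Squared gradient of the risk w.r.t. a scalar multiplier w, evaluated at w = 1.

    For squared loss, d/dw mean((w * preds - targets)^2) at w = 1
    equals mean(2 * (preds - targets) * preds).
    """
    grad = np.mean(2.0 * (preds - targets) * preds)
    return float(grad ** 2)


def irmv1_objective(envs, phi, penalty_weight):
    """Sum over environments of risk + penalty_weight * penalty.

    envs is a list of (x, y) pairs, one per environment (an assumed layout);
    phi is the weight vector of a linear predictor.
    """
    total = 0.0
    for x, y in envs:
        preds = x @ phi
        total += erm_risk(preds, y) + penalty_weight * irmv1_penalty(preds, y)
    return total


# One toy environment in which y = 2 * x exactly.
envs = [(np.array([[1.0], [2.0]]), np.array([2.0, 4.0]))]
perfect = irmv1_objective(envs, np.array([2.0]), penalty_weight=0.1)
imperfect = irmv1_objective(envs, np.array([1.0]), penalty_weight=0.1)
```

With penalty_weight set to 0 the objective reduces to the summed per-environment ERM risk; larger values place more emphasis on the predictor being simultaneously optimal in every environment.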
