Title
Finite-Sample Guarantees for Wasserstein Distributionally Robust Optimization: Breaking the Curse of Dimensionality
Authors
Abstract
Wasserstein distributionally robust optimization (DRO) aims to find robust and generalizable solutions by hedging against data perturbations in Wasserstein distance. Despite its recent empirical success in operations research and machine learning, existing performance guarantees for generic loss functions are either overly conservative due to the curse of dimensionality, or valid only in large-sample asymptotics. In this paper, we develop a non-asymptotic framework for analyzing the out-of-sample performance of Wasserstein robust learning and the generalization bound for its related Lipschitz and gradient regularization problems. To the best of our knowledge, this gives the first finite-sample guarantee for generic Wasserstein DRO problems without suffering from the curse of dimensionality. Our results highlight that Wasserstein DRO, with a properly chosen radius, balances the empirical mean of the loss against the variation of the loss, measured by the Lipschitz norm or the gradient norm of the loss. Our analysis is based on two novel methodological developments that are of independent interest: 1) a new concentration inequality controlling the decay rate of large deviation probabilities by the variation of the loss, and 2) a localized Rademacher complexity theory based on the variation of the loss.
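To make the balancing claim concrete, the following is a minimal numerical sketch (not the paper's algorithm) of the regularized surrogate the abstract alludes to: for a small radius `rho`, Wasserstein DRO of a smooth loss is closely approximated by the empirical mean of the loss plus `rho` times a measure of its variation, here taken to be the empirical gradient norm. The squared-loss regression setup and the function name `dro_surrogate` are illustrative assumptions.

```python
import numpy as np

def dro_surrogate(theta, X, y, rho):
    """Empirical mean of the squared loss plus rho times the mean
    gradient norm -- a first-order surrogate for Wasserstein DRO
    with radius rho (illustrative, not the paper's exact bound)."""
    residuals = X @ theta - y
    losses = 0.5 * residuals ** 2
    # Per-sample gradient of the loss with respect to the data point x_i:
    # d/dx [0.5 * (x^T theta - y)^2] = (x^T theta - y) * theta,
    # whose Euclidean norm is |residual| * ||theta||.
    grad_norms = np.abs(residuals) * np.linalg.norm(theta)
    return losses.mean() + rho * grad_norms.mean()

# Synthetic linear-regression data for the demonstration.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
theta_true = np.array([1.0, -2.0, 0.5])
y = X @ theta_true + 0.1 * rng.normal(size=100)

# With rho = 0 this is the plain empirical risk; rho > 0 adds the
# variation penalty, so the surrogate value can only increase.
val = dro_surrogate(theta_true, X, y, rho=0.1)
```

The radius `rho` plays the role of the regularization weight: choosing it on the order of the statistical error of the empirical distribution trades off fidelity to the data against robustness to perturbations.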