Paper Title
Provable Adversarial Robustness for Fractional Lp Threat Models
Paper Authors

Paper Abstract
In recent years, researchers have extensively studied adversarial robustness in a variety of threat models, including L_0, L_1, L_2, and L_infinity-norm bounded adversarial attacks. However, attacks bounded by fractional L_p "norms" (quasi-norms defined by the L_p distance with 0 < p < 1) have yet to be thoroughly considered. We proactively propose a defense with several desirable properties: it provides provable (certified) robustness, scales to ImageNet, and yields deterministic (rather than high-probability) certified guarantees when applied to quantized data (e.g., images). Our technique for fractional L_p robustness constructs expressive, deep classifiers that are globally Lipschitz with respect to the L_p^p metric, for any 0 < p < 1. However, our method is even more general: we can construct classifiers that are globally Lipschitz with respect to any metric defined as the sum of concave functions of the components. Our approach builds on recent work by Levine and Feizi (2021), which provides a provable defense against L_1 attacks. We demonstrate that our proposed guarantees are highly non-vacuous compared to the trivial baseline of using Levine and Feizi (2021) directly and applying norm inequalities. Code is available at https://github.com/alevine0/fractionalLpRobustness.
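To make the threat model concrete, the sketch below (illustrative only; the function name is ours and not from the paper's code) computes the L_p^p distance, sum_i |x_i - y_i|^p for 0 < p < 1, that the abstract refers to. Because t -> t^p is concave on [0, inf) with value 0 at 0, it is subadditive, so this componentwise sum is a true metric (the triangle inequality holds) even though the corresponding L_p "norm" (sum |.|^p)^(1/p) is only a quasi-norm.

```python
import numpy as np


def lp_p_distance(x, y, p):
    """Fractional L_p^p distance: sum_i |x_i - y_i|^p, for 0 < p < 1.

    This is the metric (not the quasi-norm) with respect to which the
    paper's classifiers are globally Lipschitz. Illustrative helper,
    not the authors' implementation.
    """
    assert 0.0 < p < 1.0, "fractional threat model requires 0 < p < 1"
    diff = np.abs(np.asarray(x, dtype=float) - np.asarray(y, dtype=float))
    return float(np.sum(diff ** p))


if __name__ == "__main__":
    p = 0.5
    x = np.array([0.0, 0.0])
    y = np.array([1.0, 1.0])
    z = np.array([2.0, 0.0])
    # Triangle inequality holds for the L_p^p metric:
    assert lp_p_distance(x, z, p) <= lp_p_distance(x, y, p) + lp_p_distance(y, z, p)
    print(lp_p_distance(x, y, p))  # 1^0.5 + 1^0.5 = 2.0
```

Note that as p -> 0, |t|^p approaches the 0/1 indicator of t != 0, so the L_p^p distance interpolates between the L_0 counting distance and the L_1 distance; this is why the fractional regime sits between the well-studied L_0 and L_1 threat models.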