Paper Title


Sample Complexity Bounds for Robustly Learning Decision Lists against Evasion Attacks

Authors

Pascale Gourdeau, Varun Kanade, Marta Kwiatkowska, James Worrell

Abstract

A fundamental problem in adversarial machine learning is to quantify how much training data is needed in the presence of evasion attacks. In this paper we address this issue within the framework of PAC learning, focusing on the class of decision lists. Given that distributional assumptions are essential in the adversarial setting, we work with probability distributions on the input data that satisfy a Lipschitz condition: nearby points have similar probability. Our key results illustrate that the adversary's budget (that is, the number of bits it can perturb on each input) is a fundamental quantity in determining the sample complexity of robust learning. Our first main result is a sample-complexity lower bound: the class of monotone conjunctions (essentially the simplest non-trivial hypothesis class on the Boolean hypercube), and hence any superclass, has sample complexity at least exponential in the adversary's budget. Our second main result is a corresponding upper bound: for every fixed $k$, the class of $k$-decision lists has polynomial sample complexity against a $\log(n)$-bounded adversary. This sheds further light on the question of whether an efficient PAC learning algorithm can always be used as an efficient $\log(n)$-robust learning algorithm under the uniform distribution.
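To make the abstract's central notion concrete, the sketch below illustrates a robust (exact-in-the-ball) loss for monotone conjunctions on the Boolean hypercube: a hypothesis errs at a point if a budget-bounded adversary can flip at most `budget` bits and reach a point where the hypothesis disagrees with the ground-truth concept. This is a minimal brute-force illustration of the general idea, not the paper's algorithm; the function and parameter names (`conj`, `exact_in_ball_loss`, `h_bits`, `c_bits`, `budget`) are invented for this example.

```python
from itertools import combinations

def conj(x, bits):
    # Monotone conjunction over the Boolean hypercube:
    # outputs 1 iff every relevant bit of x is set to 1.
    return int(all(x[i] == 1 for i in bits))

def exact_in_ball_loss(x, h_bits, c_bits, budget):
    # Robust loss of hypothesis h (conjunction of h_bits) at point x:
    # 1 if some z within Hamming distance `budget` of x satisfies
    # h(z) != c(z), where c (conjunction of c_bits) is the target; else 0.
    n = len(x)
    for r in range(budget + 1):
        for idxs in combinations(range(n), r):
            z = list(x)
            for i in idxs:
                z[i] = 1 - z[i]  # adversary flips this subset of bits
            if conj(z, h_bits) != conj(z, c_bits):
                return 1
    return 0
```

For example, with target $c = x_0 \wedge x_1$ and hypothesis $h = x_0$ at the point $(1,1)$, the two agree, so the standard loss is 0; but a budget-1 adversary can flip bit 1 to reach $(1,0)$, where $h$ outputs 1 and $c$ outputs 0, so the robust loss is 1. The brute-force search over all perturbation subsets is exponential in the budget, which echoes (but does not prove) the abstract's point that the budget is the key parameter.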
