Paper Title

Robustifying Binary Classification to Adversarial Perturbation

Authors

Fariborz Salehi, Babak Hassibi

Abstract

Despite the enormous success of machine learning models in various applications, most of these models lack resilience to (even small) perturbations in their input data. Hence, new methods to robustify machine learning models are essential. To this end, in this paper we consider the problem of binary classification with adversarial perturbations. Investigating the solution to a min-max optimization (which considers the worst-case loss in the presence of adversarial perturbations), we introduce a generalization of the max-margin classifier that takes into account the power of the adversary in manipulating the data. We refer to this classifier as the "Robust Max-margin" (RM) classifier. Under some mild assumptions on the loss function, we theoretically show that the gradient descent iterates (with sufficiently small step size) converge in direction to the RM classifier. Therefore, the RM classifier can be studied to compute various performance measures (e.g., generalization error) of binary classification with adversarial perturbations.
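To make the min-max formulation concrete, the following is a minimal sketch (not the paper's actual experiments) for a linear classifier under an ℓ2-bounded adversary. For a linear model, the inner maximization over perturbations with `||δ||₂ ≤ ε` has a closed form: the worst-case margin of a point `(x, y)` is `y⟨w, x⟩ − ε‖w‖₂`. Running gradient descent on the resulting robust logistic loss then illustrates the kind of iterates whose direction the paper analyzes. The toy data, budget `eps`, and step size are illustrative choices, not values from the paper.

```python
import numpy as np

# Toy linearly separable data: labels y in {-1, +1}.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(2.0, 0.3, (20, 2)), rng.normal(-2.0, 0.3, (20, 2))])
y = np.hstack([np.ones(20), -np.ones(20)])

eps = 0.5  # adversary's l2 budget (illustrative choice)

def robust_logistic_loss(w, X, y, eps):
    # For an l2-bounded adversary and a linear classifier, the inner max of
    # the min-max problem is attained at delta = -eps * y * w / ||w||, so the
    # worst-case margin is y * <w, x> - eps * ||w||_2.
    margins = y * (X @ w) - eps * np.linalg.norm(w)
    return np.mean(np.log1p(np.exp(-margins)))

def grad(w, X, y, eps):
    margins = y * (X @ w) - eps * np.linalg.norm(w)
    s = -1.0 / (1.0 + np.exp(margins))                     # d(loss)/d(margin)
    dmargin_dw = y[:, None] * X - eps * w / np.linalg.norm(w)
    return (s[:, None] * dmargin_dw).mean(axis=0)

# Gradient descent on the robust (worst-case) loss.
w = np.array([1.0, 0.0])
lr = 0.1
for _ in range(2000):
    w -= lr * grad(w, X, y, eps)

# Every point should be classified correctly even after the worst-case
# eps-perturbation, i.e. all robust margins are positive.
robust_margins = y * (X @ w) - eps * np.linalg.norm(w)
print(np.all(robust_margins > 0))
```

On separable data the robust loss has no finite minimizer (scaling `w` up keeps decreasing it), which is why the paper characterizes the limit of the iterates by their direction rather than their value.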
