Paper Title

Certifying Joint Adversarial Robustness for Model Ensembles

Paper Authors

Mainuddin Ahmad Jonas, David Evans

Paper Abstract

Deep Neural Networks (DNNs) are often vulnerable to adversarial examples. Several proposed defenses deploy an ensemble of models with the hope that, although the individual models may be vulnerable, an adversary will not be able to find an adversarial example that succeeds against the ensemble. Depending on how the ensemble is used, an attacker may need to find a single adversarial example that succeeds against all, or a majority, of the models in the ensemble. The effectiveness of ensemble defenses against strong adversaries depends on the vulnerability spaces of the models in the ensemble being disjoint. We consider the joint vulnerability of an ensemble of models, and propose a novel technique for certifying the joint robustness of ensembles, building upon prior works on single-model robustness certification. We evaluate the robustness of various model ensembles, including models trained using cost-sensitive robustness to be diverse, to improve understanding of the potential effectiveness of ensemble models as a defense against adversarial examples.
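
The abstract distinguishes two threat models: an attacker who must fool every model in the ensemble, and one who only needs to fool a majority. The Python sketch below shows how single-model certificates compose under each threat model. It is illustrative only: the linear-model certifier and the naive composition are our assumptions, not the paper's technique, which certifies joint robustness directly rather than combining independent per-model certificates.

import numpy as np

def certify_linear(W, b, x, y, eps):
    # Exact L-infinity certificate for a linear classifier with logits W @ x + b.
    # The worst-case margin of class y over class j under ||delta||_inf <= eps is
    #   (W[y] - W[j]) @ x + (b[y] - b[j]) - eps * ||W[y] - W[j]||_1,
    # so the model is certified robust at (x, y) iff every margin is positive.
    diff = W[y] - W                                  # shape: (num_classes, dim)
    margins = diff @ x + (b[y] - b) - eps * np.abs(diff).sum(axis=1)
    margins[y] = np.inf                              # skip the trivial y-vs-y case
    return bool((margins > 0).all())

def certify_ensemble_naive(certifiers, x, y, eps, rule="majority"):
    # Conservative joint certificate composed from single-model certificates.
    # rule="all":      the attacker must fool every model, so one certified
    #                  model already rules out any joint adversarial example.
    # rule="majority": if more than half the models are certified robust,
    #                  the majority vote provably stays at label y.
    n_certified = sum(c(x, y, eps) for c in certifiers)
    return n_certified >= 1 if rule == "all" else n_certified > len(certifiers) // 2

# Toy usage: three random linear models on a 5-dimensional input.
rng = np.random.default_rng(0)
models = [(rng.normal(size=(3, 5)), rng.normal(size=3)) for _ in range(3)]
certifiers = [lambda x, y, eps, W=W, b=b: certify_linear(W, b, x, y, eps)
              for W, b in models]
x = rng.normal(size=5)
y = int(np.argmax(models[0][0] @ x + models[0][1]))
print(certify_ensemble_naive(certifiers, x, y, eps=0.05, rule="majority"))

This composition is sound but loose: it treats each model's worst-case perturbation independently, while in reality every model must be fooled by the same perturbed input. Closing that gap is precisely what a joint certificate, as proposed in the paper, aims to do.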
