Paper Title
Challenging the adversarial robustness of DNNs based on error-correcting output codes
Paper Authors
Paper Abstract
The existence of adversarial examples and the ease with which they can be generated raise several security concerns about deep learning systems, pushing researchers to develop suitable defense mechanisms. The use of networks adopting error-correcting output codes (ECOC) has recently been proposed to counter the creation of adversarial examples in a white-box setting. In this paper, we carry out an in-depth investigation of the adversarial robustness achieved by the ECOC approach. We do so by proposing a new adversarial attack specifically designed for multi-label classification architectures, such as the ECOC-based one, and by applying two existing attacks. In contrast to previous findings, our analysis reveals that ECOC-based networks can be attacked quite easily by introducing a small adversarial perturbation. Moreover, the adversarial examples can be generated in such a way as to achieve a high probability for the predicted target class, making it difficult to use the prediction confidence to detect them. Our findings are confirmed by experimental results on the MNIST, CIFAR-10 and GTSRB classification tasks.
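To make the setting concrete, the sketch below illustrates, under stated assumptions and not as the paper's actual implementation, how an ECOC-style classifier decodes its output by correlating it with a codebook of class codewords, and how a generic targeted PGD-style attack could push that output toward the codeword of a chosen target class. The network ToyECOCNet, the random codebook, and all hyperparameters (eps, alpha, steps) are hypothetical placeholders and do not correspond to the attack or the architectures evaluated in the paper.

```python
# Minimal, self-contained sketch (PyTorch). It is NOT the paper's attack:
# the model, the random codebook and every hyperparameter are assumptions.
import torch
import torch.nn as nn

NUM_CLASSES, CODE_LEN = 10, 16

# Random +/-1 codebook standing in for a real error-correcting output code.
torch.manual_seed(0)
codebook = torch.sign(torch.randn(NUM_CLASSES, CODE_LEN))

class ToyECOCNet(nn.Module):
    """Maps a 28x28 image to a CODE_LEN-dimensional output in [-1, 1]."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Flatten(),
            nn.Linear(28 * 28, 128), nn.ReLU(),
            nn.Linear(128, CODE_LEN), nn.Tanh())

    def forward(self, x):
        return self.net(x)

def decode(outputs, codebook):
    """Predict the class whose codeword best correlates with the output;
    a softmax over the correlation scores gives a confidence-like score."""
    scores = outputs @ codebook.t()                # (batch, NUM_CLASSES)
    probs = torch.softmax(scores, dim=1)
    return scores.argmax(dim=1), probs

def targeted_pgd(model, x, target_class, eps=0.1, alpha=0.01, steps=40):
    """Generic L-infinity PGD that maximizes the correlation between the
    network output and the target class codeword (illustrative only)."""
    target_code = codebook[target_class]
    x_adv = x.clone().detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        out = model(x_adv)
        loss = -(out * target_code).sum()          # push output toward target codeword
        loss.backward()
        with torch.no_grad():
            x_adv = x_adv - alpha * x_adv.grad.sign()
            x_adv = torch.min(torch.max(x_adv, x - eps), x + eps)  # stay in eps-ball
            x_adv = x_adv.clamp(0.0, 1.0)          # keep a valid image
        x_adv = x_adv.detach()
    return x_adv

if __name__ == "__main__":
    model = ToyECOCNet()
    x = torch.rand(1, 1, 28, 28)                   # stand-in for an MNIST-sized image
    clean_pred, _ = decode(model(x), codebook)
    target = (clean_pred.item() + 1) % NUM_CLASSES
    x_adv = targeted_pgd(model, x, target)
    adv_pred, adv_probs = decode(model(x_adv), codebook)
    print("clean:", clean_pred.item(), "target:", target,
          "adversarial:", adv_pred.item(),
          "target confidence: %.2f" % adv_probs[0, target].item())
```

On this untrained toy model the script only illustrates the mechanics of codeword decoding and of a targeted gradient update; whether such an attack succeeds with a small perturbation and high target-class confidence on a trained ECOC network is precisely what the paper investigates experimentally.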