Paper Title
Adversarial Training against Location-Optimized Adversarial Patches
Paper Authors
Paper Abstract
Deep neural networks have been shown to be susceptible to adversarial examples -- small, imperceptible changes constructed to cause mis-classification in otherwise highly accurate image classifiers. As a practical alternative, recent work proposed so-called adversarial patches: clearly visible, but adversarially crafted rectangular patches in images. These patches can easily be printed and applied in the physical world. While defenses against imperceptible adversarial examples have been studied extensively, robustness against adversarial patches is poorly understood. In this work, we first devise a practical approach to obtain adversarial patches while actively optimizing their location within the image. Then, we apply adversarial training on these location-optimized adversarial patches and demonstrate significantly improved robustness on CIFAR10 and GTSRB. Additionally, in contrast to adversarial training on imperceptible adversarial examples, our adversarial patch training does not reduce accuracy.
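To make the described procedure concrete, below is a minimal sketch (not the authors' code) of location-optimized adversarial patch training in PyTorch. It assumes a standard image classifier `model`, CIFAR10-sized inputs, a square patch, and combines signed-gradient updates of the patch values with a simple greedy search over nearby patch locations; the function names, step sizes, and the specific search moves are illustrative assumptions.

```python
# Sketch of location-optimized adversarial patch training (illustrative, not the paper's code).
import torch
import torch.nn.functional as F


def make_mask(images, size, top, left):
    """Binary mask that is 1 inside the square patch region and 0 elsewhere."""
    mask = torch.zeros_like(images)
    mask[:, :, top:top + size, left:left + size] = 1.0
    return mask


def patch_loss(model, images, labels, patch, mask):
    """Cross-entropy of the model on images with the patch pasted in (to be maximized)."""
    adv = images * (1 - mask) + patch * mask
    return F.cross_entropy(model(adv), labels)


def location_optimized_patch(model, images, labels, size=8, iters=25, step=0.05):
    """Jointly optimize patch values (signed gradient ascent) and location (greedy search)."""
    n, c, h, w = images.shape
    top, left = (h - size) // 2, (w - size) // 2   # start in the image center
    patch = torch.rand_like(images)                # random initial patch content

    for _ in range(iters):
        # 1) Update patch values by signed gradient ascent on the classification loss.
        patch.requires_grad_(True)
        mask = make_mask(images, size, top, left)
        loss = patch_loss(model, images, labels, patch, mask)
        grad, = torch.autograd.grad(loss, patch)
        patch = (patch.detach() + step * grad.sign()).clamp(0, 1)

        # 2) Greedily try a few shifted locations and keep the one with the highest loss.
        best_loss, best = loss.item(), (top, left)
        for dt, dl in [(-2, 0), (2, 0), (0, -2), (0, 2)]:
            t = min(max(top + dt, 0), h - size)
            l = min(max(left + dl, 0), w - size)
            with torch.no_grad():
                cand = patch_loss(model, images, labels, patch,
                                  make_mask(images, size, t, l)).item()
            if cand > best_loss:
                best_loss, best = cand, (t, l)
        top, left = best

    mask = make_mask(images, size, top, left)
    return (images * (1 - mask) + patch * mask).detach()


def adversarial_training_step(model, optimizer, images, labels):
    """One adversarial patch training step: attack the batch, then train on the patched images."""
    model.eval()
    adv = location_optimized_patch(model, images, labels)
    model.train()
    optimizer.zero_grad()
    loss = F.cross_entropy(model(adv), labels)
    loss.backward()
    optimizer.step()
    return loss.item()
```

In this sketch, training only on patched images reflects the abstract's claim that patch training need not sacrifice clean accuracy; in practice one might also mix in clean batches, and the patch size, number of attack iterations, and location search strategy would follow the paper's actual settings.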