PatchZero: Defending against Adversarial Patch Attacks by Detecting and Zeroing the Patch

Paper Title

PatchZero: Defending against Adversarial Patch Attacks by Detecting and Zeroing the Patch

Authors

Ke Xu, Yao Xiao, Zhaoheng Zheng, Kaijie Cai, Ram Nevatia

Abstract

Adversarial patch attacks mislead neural networks by injecting adversarial pixels within a local region. Patch attacks can be highly effective in a variety of tasks and physically realizable via attachment (e.g. a sticker) to the real-world objects. Despite the diversity in attack patterns, adversarial patches tend to be highly textured and different in appearance from natural images. We exploit this property and present PatchZero, a general defense pipeline against white-box adversarial patches without retraining the downstream classifier or detector. Specifically, our defense detects adversaries at the pixel-level and "zeros out" the patch region by repainting with mean pixel values. We further design a two-stage adversarial training scheme to defend against the stronger adaptive attacks. PatchZero achieves SOTA defense performance on the image classification (ImageNet, RESISC45), object detection (PASCAL VOC), and video classification (UCF101) tasks with little degradation in benign performance. In addition, PatchZero transfers to different patch shapes and attack types.
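The "zeroing" step described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function name `zero_out_patch`, the `fill` parameter, and the fallback to the input image's own per-channel mean are assumptions made for illustration. The pixel-level mask is assumed to come from a separate patch detector, which is not shown here.

```python
import numpy as np

def zero_out_patch(image, mask, fill=None):
    """Repaint pixels flagged as adversarial with a mean color.

    image: array of shape (H, W, C)
    mask:  boolean array of shape (H, W); True marks flagged pixels
           (assumed to be produced by a separate pixel-level detector)
    fill:  optional per-channel fill value of shape (C,). If omitted, the
           per-channel mean of the input image is used (an assumption of
           this sketch, not a detail taken from the abstract).
    """
    image = np.asarray(image, dtype=np.float32)
    mask = np.asarray(mask, dtype=bool)
    if fill is None:
        fill = image.mean(axis=(0, 1))
    out = image.copy()
    # Boolean indexing selects an (N, C) view; fill broadcasts across it.
    out[mask] = fill
    return out

# Toy example: a 4x4 RGB image with a bright 2x2 "patch" in the middle.
img = np.zeros((4, 4, 3), dtype=np.float32)
img[1:3, 1:3] = 1.0
patch_mask = np.zeros((4, 4), dtype=bool)
patch_mask[1:3, 1:3] = True
cleaned = zero_out_patch(img, patch_mask, fill=np.array([0.5, 0.5, 0.5]))
```

The repainted image would then be passed to the unchanged downstream classifier or detector, which is what lets the defense avoid retraining that model.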
