Paper Title

Leveraging Local Patch Differences in Multi-Object Scenes for Generative Adversarial Attacks

Authors

Aich, Abhishek, Li, Shasha, Song, Chengyu, Asif, M. Salman, Krishnamurthy, Srikanth V., Roy-Chowdhury, Amit K.

Abstract

State-of-the-art generative model-based attacks against image classifiers overwhelmingly focus on single-object (i.e., single dominant object) images. Different from such settings, we tackle a more practical problem of generating adversarial perturbations using multi-object (i.e., multiple dominant objects) images as they are representative of most real-world scenes. Our goal is to design an attack strategy that can learn from such natural scenes by leveraging the local patch differences that occur inherently in such images (e.g. difference between the local patch on the object `person' and the object `bike' in a traffic scene). Our key idea is to misclassify an adversarial multi-object image by confusing the victim classifier for each local patch in the image. Based on this, we propose a novel generative attack (called Local Patch Difference or LPD-Attack) where a novel contrastive loss function uses the aforesaid local differences in feature space of multi-object scenes to optimize the perturbation generator. Through various experiments across diverse victim convolutional neural networks, we show that our approach outperforms baseline generative attacks with highly transferable perturbations when evaluated under different white-box and black-box settings.
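The abstract does not give the exact form of the LPD-Attack contrastive loss, so the following is only a hypothetical NumPy sketch of the general idea it describes: split a feature map into local patches, then penalize each adversarial patch for remaining similar to its own clean patch relative to the other (differing) patches in the scene. All function names, the patch size, and the temperature `tau` are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def extract_patches(feat, patch):
    """Split a (C, H, W) feature map into flattened non-overlapping local patches."""
    C, H, W = feat.shape
    ph, pw = patch
    patches = []
    for i in range(0, H - ph + 1, ph):
        for j in range(0, W - pw + 1, pw):
            patches.append(feat[:, i:i + ph, j:j + pw].reshape(-1))
    return np.stack(patches)  # shape: (num_patches, C * ph * pw)

def cosine(a, b):
    """Cosine similarity between two flattened patch features."""
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8)

def patch_contrastive_loss(adv_feat, clean_feat, patch=(4, 4), tau=0.1):
    """Hypothetical patch-wise contrastive objective (not the paper's exact loss).

    For each adversarial patch, a softmax over similarities to all clean
    patches is computed; the loss encourages the adversarial patch NOT to
    match its own clean counterpart, i.e. it exploits the local differences
    between patches of distinct objects in a multi-object scene.
    """
    A = extract_patches(adv_feat, patch)
    C = extract_patches(clean_feat, patch)
    n = A.shape[0]
    loss = 0.0
    for k in range(n):
        sims = np.array([cosine(A[k], C[m]) for m in range(n)]) / tau
        logits = np.exp(sims - sims.max())        # numerically stable softmax
        p_self = logits[k] / logits.sum()         # prob. of matching own clean patch
        loss += -np.log(1.0 - p_self + 1e-8)      # reward confusing the classifier
    return loss / n
```

In an actual attack, `adv_feat` and `clean_feat` would be intermediate activations of the victim (or surrogate) classifier, and this loss would be minimized with respect to the parameters of the perturbation generator.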
