GRAPHITE: Generating Automatic Physical Examples for Machine-Learning Attacks on Computer Vision Systems

Paper Title

GRAPHITE: Generating Automatic Physical Examples for Machine-Learning Attacks on Computer Vision Systems

Authors

Ryan Feng, Neal Mangaokar, Jiefeng Chen, Earlence Fernandes, Somesh Jha, Atul Prakash

Abstract

This paper investigates how easily an adversary can generate adversarial examples for real-world scenarios. We address three key requirements for practical attacks in the real world: 1) automatically constraining the size and shape of the attack so that it can be applied with stickers, 2) transform-robustness, i.e., robustness of an attack to environmental physical variations such as viewpoint and lighting changes, and 3) supporting attacks not only in white-box but also in black-box hard-label scenarios, so that the adversary can attack proprietary models. In this work, we propose GRAPHITE, an efficient and general framework for generating attacks that satisfy the above three key requirements. GRAPHITE takes advantage of transform-robustness, a metric based on expectation over transforms (EoT), to automatically generate small masks and optimize with gradient-free optimization. GRAPHITE is also flexible, as it can easily trade off transform-robustness, perturbation size, and query count in black-box settings. On a GTSRB model in a hard-label black-box setting, we are able to find attacks on all 1,806 possible victim-target class pairs with averages of 77.8% transform-robustness, a perturbation size of 16.63% of the victim images, and 126K queries per pair. For digital-only attacks, where achieving transform-robustness is not a requirement, GRAPHITE is able to find successful small-patch attacks with an average of only 566 queries for 92.2% of victim-target pairs. GRAPHITE is also able to find successful attacks against PatchGuard, a recently proposed defense against patch-based attacks, using perturbations that modify small areas of the input image.
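To make the transform-robustness metric concrete, the following is a minimal sketch of how such a quantity could be estimated in a hard-label setting: sample random transforms, query the model once per sample, and report the fraction of queries that return the target label. It is not the authors' implementation. The transform set (brightness scaling plus a small horizontal shift), the function names, and the toy classifier are all assumptions made for illustration.

```python
import numpy as np

def random_transform(image, rng):
    """Hypothetical transform set: random brightness scale and small horizontal shift."""
    scale = rng.uniform(0.7, 1.3)
    shift = int(rng.integers(-2, 3))
    shifted = np.roll(image, shift, axis=1)
    return np.clip(shifted * scale, 0.0, 1.0)

def transform_robustness(predict_label, image, target, n_samples=100, seed=0):
    """Monte Carlo estimate: fraction of sampled transforms classified as `target`.

    Only the predicted label is used (hard-label access), and each sample
    costs exactly one model query.
    """
    rng = np.random.default_rng(seed)
    hits = sum(
        predict_label(random_transform(image, rng)) == target
        for _ in range(n_samples)
    )
    return hits / n_samples

def apply_masked_perturbation(victim, perturbation, mask):
    """Replace victim pixels with perturbation pixels wherever the boolean mask is set."""
    return np.where(mask, perturbation, victim)

def perturbation_size(mask):
    """Fraction of image pixels covered by the mask."""
    return float(mask.mean())

def toy_predict_label(image):
    """Stand-in classifier (assumption): label 1 for bright images, else 0."""
    return int(image.mean() > 0.5)
```

In this sketch, the three quantities the abstract says can be traded off map to `transform_robustness` (the score), `perturbation_size` (mask coverage), and `n_samples` (queries spent per estimate).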
