BppAttack: Stealthy and Efficient Trojan Attacks against Deep Neural Networks via Image Quantization and Contrastive Adversarial Learning

Paper title

BppAttack: Stealthy and Efficient Trojan Attacks against Deep Neural Networks via Image Quantization and Contrastive Adversarial Learning

Authors

Zhenting Wang, Juan Zhai, Shiqing Ma

Abstract

Deep neural networks are vulnerable to Trojan attacks. Existing attacks use visible patterns (e.g., a patch or image transformations) as triggers, which are vulnerable to human inspection. In this paper, we propose stealthy and efficient Trojan attacks, BppAttack. Based on existing biology literature on human visual systems, we propose to use image quantization and dithering as the Trojan trigger, making imperceptible changes. It is a stealthy and efficient attack without training auxiliary models. Due to the small changes made to images, it is hard to inject such triggers during training. To alleviate this problem, we propose a contrastive learning based approach that leverages adversarial attacks to generate negative sample pairs so that the learned trigger is precise and accurate. The proposed method achieves high attack success rates on four benchmark datasets, including MNIST, CIFAR-10, GTSRB, and CelebA. It also effectively bypasses existing Trojan defenses and human inspection. Our code can be found in https://github.com/RU-System-Software-and-Security/BppAttack.
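
The abstract names image quantization and dithering as the imperceptible transformation but gives no parameters. As a rough illustration only, the sketch below reduces the bit depth of each color channel and optionally applies Floyd-Steinberg error diffusion. The function names, the `bits` parameter, and the choice of Floyd-Steinberg as the dithering scheme are assumptions made for this example; they are not taken from the authors' repository.

```python
import numpy as np


def quantize(image, bits):
    """Reduce a uint8 image to `bits` bits per channel (hypothetical helper)."""
    levels = 2 ** bits - 1
    scaled = image.astype(np.float64) / 255.0
    # Snap every value to the nearest of the 2**bits evenly spaced levels.
    snapped = np.round(scaled * levels) / levels
    return np.round(snapped * 255.0).astype(np.uint8)


def quantize_with_dithering(image, bits):
    """Quantize with Floyd-Steinberg error diffusion (assumed dithering scheme)."""
    levels = 2 ** bits - 1
    work = image.astype(np.float64) / 255.0
    if work.ndim == 2:
        work = work[..., None]  # treat grayscale as a single channel
    height, width = work.shape[:2]
    for y in range(height):
        for x in range(width):
            old = work[y, x].copy()
            new = np.round(np.clip(old, 0.0, 1.0) * levels) / levels
            work[y, x] = new
            err = old - new
            # Push the rounding error onto pixels that are not yet visited.
            if x + 1 < width:
                work[y, x + 1] += err * 7 / 16
            if y + 1 < height:
                if x > 0:
                    work[y + 1, x - 1] += err * 3 / 16
                work[y + 1, x] += err * 5 / 16
                if x + 1 < width:
                    work[y + 1, x + 1] += err * 1 / 16
    out = np.round(np.clip(work, 0.0, 1.0) * 255.0).astype(np.uint8)
    return out.reshape(image.shape)
```

With `bits=8` the plain quantizer leaves a uint8 image unchanged, and smaller values give coarser color palettes. Dithering spreads each pixel's rounding error over its neighbors, which tends to hide the banding that plain quantization produces in smooth gradients.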
