Title
An Eye for an Eye: Defending against Gradient-based Attacks with Gradients
Authors
Abstract
Deep learning models have been shown to be vulnerable to adversarial attacks. In particular, gradient-based attacks have recently demonstrated high success rates. The gradient measures how each image pixel affects the model output, and thus contains critical information for generating malicious perturbations. In this paper, we show that gradients can also be exploited as a powerful weapon to defend against adversarial attacks. Taking both gradient maps and adversarial images as inputs, we propose a Two-stream Restoration Network (TRN) to restore adversarial images. To optimally restore the perturbed images from the two input streams, we propose a Gradient Map Estimation Mechanism to estimate the gradients of adversarial images and design a Fusion Block in TRN to explore and fuse the information from the two streams. Once trained, our TRN can defend against a wide range of attack methods without significantly degrading performance on benign inputs. Moreover, our method is generalizable, scalable, and hard to bypass. Experimental results on CIFAR10, SVHN, and Fashion MNIST demonstrate that our method outperforms state-of-the-art defense methods.
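The core idea above, that the input gradient has the same spatial shape as the image and can be paired with it as a second input stream, can be sketched minimally. This is a hypothetical toy illustration, not the paper's TRN: we use a linear "model" whose input gradient is known analytically, and simply stack the image with its gradient map along a channel axis, analogous to feeding both streams into a restoration network.

```python
import numpy as np

# Hypothetical toy "model": logit = sum(W * image).
# For this linear model, d(logit)/d(pixel) is exactly W,
# i.e. how much each pixel influences the output.
rng = np.random.default_rng(0)
W = rng.normal(size=(8, 8))        # per-pixel weights of the toy model
image = rng.uniform(size=(8, 8))   # stand-in for an (adversarial) input image

def gradient_map(img: np.ndarray) -> np.ndarray:
    """Input gradient of the toy linear model; same shape as the image."""
    return W

grad = gradient_map(image)

# Two-stream input: image stream + gradient stream, stacked as channels.
two_stream = np.stack([image, grad], axis=0)
print(two_stream.shape)  # (2, 8, 8)
```

In the actual method the gradient of an adversarial image is not available in closed form, which is why the paper introduces a Gradient Map Estimation Mechanism; this sketch only shows the shape-compatible pairing of the two streams.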