Paper Title
Dual Graphs of Polyhedral Decompositions for the Detection of Adversarial Attacks
Paper Authors
Abstract
Previous work has shown that a neural network with the rectified linear unit (ReLU) activation function induces a convex polyhedral decomposition of the input space. Such a decomposition can be represented by a dual graph whose vertices correspond to polyhedra and whose edges correspond to pairs of polyhedra sharing a facet; this dual graph is a subgraph of a Hamming graph. This paper illustrates how the dual graph can be used to detect and analyze adversarial attacks in the context of digital images. When an image passes through a network containing ReLU nodes, the firing or non-firing of each node can be encoded as a bit ($1$ for ReLU activation, $0$ for ReLU non-activation). The sequence of all bit activations identifies the image with a bit vector, which identifies it with a polyhedron in the decomposition and, in turn, with a vertex in the dual graph. We identify ReLU bits that discriminate between non-adversarial and adversarial images and examine how well collections of these discriminators can vote as an ensemble to build an adversarial image detector. Specifically, we examine the similarities and differences between the ReLU bit vectors of adversarial images and those of their non-adversarial counterparts, using a pre-trained ResNet-50 architecture. While this paper focuses on adversarial digital images, the ResNet-50 architecture, and the ReLU activation function, our methods extend to other network architectures, activation functions, and types of datasets.
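The bit encoding described in the abstract can be illustrated with a minimal sketch. This is not the authors' code: it uses a small, randomly initialized fully connected ReLU network in NumPy rather than ResNet-50, and the names `relu_bit_vector`, `hamming_distance`, `layers`, and `x_perturbed` are hypothetical. It also assumes that a pre-activation of exactly zero counts as non-activation. The sketch records one bit per ReLU node and compares the bit vectors of an input and a slightly perturbed copy by Hamming distance.

```python
import numpy as np


def relu_bit_vector(x, layers):
    """Concatenate the ReLU on/off pattern of every layer into one 0/1 vector.

    `layers` is a list of (W, b) pairs (hypothetical toy network).
    Assumption: a pre-activation of exactly 0 is encoded as 0 (non-activation).
    """
    bits = []
    h = x
    for W, b in layers:
        pre = W @ h + b                          # pre-activation of this layer
        bits.append((pre > 0).astype(np.uint8))  # 1 = ReLU fires, 0 = it does not
        h = np.maximum(pre, 0.0)                 # ReLU output feeds the next layer
    return np.concatenate(bits)


def hamming_distance(u, v):
    """Number of positions at which two bit vectors differ."""
    return int(np.count_nonzero(u != v))


# Toy network: 4 inputs -> 8 ReLU nodes -> 6 ReLU nodes (14 bits in total).
rng = np.random.default_rng(0)
layers = [
    (rng.standard_normal((8, 4)), rng.standard_normal(8)),
    (rng.standard_normal((6, 8)), rng.standard_normal(6)),
]

x = rng.standard_normal(4)
x_perturbed = x + 0.05 * rng.standard_normal(4)  # stand-in for a perturbed input

bits_x = relu_bit_vector(x, layers)
bits_p = relu_bit_vector(x_perturbed, layers)
print("bits (original): ", bits_x)
print("bits (perturbed):", bits_p)
print("Hamming distance:", hamming_distance(bits_x, bits_p))
```

Two inputs with the same bit vector lie in the same polyhedron of the decomposition, so a Hamming distance of zero means they map to the same vertex of the dual graph. For a real network such as ResNet-50, the same pattern would have to be collected from the ReLU layers of the pre-trained model, which this toy example does not attempt.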