Paper Title
Generalization ability and Vulnerabilities to adversarial perturbations: Two sides of the same coin
Paper Authors
Paper Abstract
Deep neural networks (DNNs), the agents of deep learning (DL), require a massive number of parallel/sequential operations, which makes them difficult to comprehend and impedes proper diagnosis. Without better knowledge of DNNs' internal processes, deploying DNNs in high-stakes domains may lead to catastrophic failures. Therefore, to build more reliable DNNs/DL, it is imperative that we gain insights into their underlying decision-making process. Here, we use the self-organizing map (SOM) to analyze DL models' internal codes associated with DNNs' decision-making. Our analyses suggest that shallow layers close to the input layer map inputs onto homogeneous codes and that deep layers close to the output layer transform these homogeneous codes into diverse codes. We also found evidence indicating that homogeneous codes may underlie DNNs' vulnerabilities to adversarial perturbations.
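To make the method named in the abstract concrete, the following is a minimal sketch (not the authors' code) of the general technique: fitting a self-organizing map to a layer's activation vectors and inspecting how samples distribute over the map. The random "activations", grid size, and hyperparameters are all placeholder assumptions; in practice the activations would come from a shallow or deep layer of a trained DNN.

```python
import numpy as np

# --- Toy setup (placeholder for a real DNN's hidden-layer activations) ---
rng = np.random.default_rng(0)
activations = rng.normal(size=(500, 64))  # 500 samples, 64 hidden units

# --- Minimal SOM: a 10x10 grid of weight vectors in activation space ---
grid_h, grid_w, dim = 10, 10, activations.shape[1]
weights = rng.normal(size=(grid_h, grid_w, dim))
coords = np.stack(np.meshgrid(np.arange(grid_h), np.arange(grid_w),
                              indexing="ij"), axis=-1)  # (10, 10, 2)

def best_matching_unit(x):
    """Grid coordinates of the node whose weight vector is closest to x."""
    d = np.linalg.norm(weights - x, axis=-1)
    return np.unravel_index(np.argmin(d), d.shape)

n_iters, lr0, sigma0 = 2000, 0.5, 3.0
for t in range(n_iters):
    x = activations[rng.integers(len(activations))]
    bmu = np.array(best_matching_unit(x))
    # Learning rate and neighborhood radius decay over training.
    frac = t / n_iters
    lr, sigma = lr0 * (1 - frac), sigma0 * (1 - frac) + 0.5
    # Gaussian neighborhood: nodes near the BMU move more strongly toward x.
    dist2 = ((coords - bmu) ** 2).sum(axis=-1)
    h = np.exp(-dist2 / (2 * sigma ** 2))[..., None]
    weights += lr * h * (x - weights)

# Map each sample to its best-matching unit (BMU) on the trained SOM.
bmus = np.array([best_matching_unit(x) for x in activations])
print("distinct BMUs used:", len({tuple(b) for b in bmus}))
```

Counting the distinct best-matching units occupied is one simple proxy for the homogeneous-versus-diverse distinction the abstract draws: a layer whose codes collapse onto a few SOM nodes is homogeneous, while one whose codes spread across many nodes is diverse.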