Adaptive Adversarial Logits Pairing

Paper Title

Adaptive Adversarial Logits Pairing

Authors

Shangxi Wu, Jitao Sang, Kaiyuan Xu, Guanhua Zheng, Changsheng Xu

Abstract

Adversarial examples provide an opportunity as well as impose a challenge for understanding image classification systems. Based on an analysis of the adversarial training solution Adversarial Logits Pairing (ALP), we observe in this work that: (1) The inference of adversarially robust models tends to rely on fewer high-contribution features than that of vulnerable models. (2) The training target of ALP does not fit a noticeable portion of samples well; for these samples, the logits pairing loss is overemphasized and obstructs minimization of the classification loss. Motivated by these observations, we design an Adaptive Adversarial Logits Pairing (AALP) solution by modifying the training process and training target of ALP. Specifically, AALP consists of an adaptive feature optimization module, which uses Guided Dropout to systematically pursue fewer high-contribution features, and an adaptive sample weighting module, which sets sample-specific training weights to balance the logits pairing loss against the classification loss. Extensive experiments on multiple datasets demonstrate the superior defense performance of the proposed AALP solution.
