Paper Title
Towards Adversarial Attack on Vision-Language Pre-training Models
Paper Authors
Paper Abstract
While vision-language pre-training (VLP) models have shown revolutionary improvements on various vision-language (V+L) tasks, their adversarial robustness remains largely unexplored. This paper studies adversarial attacks on popular VLP models and V+L tasks. First, we analyze the performance of adversarial attacks under different settings. By examining the influence of different perturbed objects and attack targets, we draw several key observations that serve as guidance both for designing strong multimodal adversarial attacks and for constructing robust VLP models. Second, we propose a novel multimodal attack method on VLP models, called Collaborative Multimodal Adversarial Attack (Co-Attack), which collectively carries out attacks on the image modality and the text modality. Experimental results show that the proposed method achieves improved attack performance on different V+L downstream tasks and VLP models. The observations and the novel attack method are expected to provide new insights into the adversarial robustness of VLP models and thereby contribute to their safe and reliable deployment in more realistic scenarios. Code is available at https://github.com/adversarial-for-goodness/Co-Attack.
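To make the idea of a collaborative image+text attack concrete, below is a minimal PyTorch sketch: the text is perturbed first (here a stand-in token substitution), and the image is then perturbed with a PGD-style loop so that its embedding moves away from the embedding of the already-perturbed text. The toy encoders, step sizes, and substitution step are placeholder assumptions for illustration; this is not the paper's exact Co-Attack algorithm (see the official repository for that).

```python
# Hypothetical sketch of a collaborative image+text attack, NOT the official Co-Attack code.
import torch
import torch.nn.functional as F


class ToyImageEncoder(torch.nn.Module):
    """Stand-in for a VLP image encoder (e.g., a ViT backbone)."""
    def __init__(self, dim=64):
        super().__init__()
        self.net = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(3 * 32 * 32, dim))

    def forward(self, image):
        return self.net(image)


class ToyTextEncoder(torch.nn.Module):
    """Stand-in for a VLP text encoder; mean-pools token embeddings."""
    def __init__(self, vocab=1000, dim=64):
        super().__init__()
        self.emb = torch.nn.Embedding(vocab, dim)

    def forward(self, ids):
        return self.emb(ids).mean(dim=1)


def collaborative_image_attack(image_encoder, text_encoder, image, adv_text_ids,
                               eps=8 / 255, alpha=2 / 255, steps=10):
    """PGD-style image perturbation that pushes the image embedding away from the
    embedding of the (already perturbed) text, so the two perturbations collaborate."""
    adv_image = image.clone().detach()
    with torch.no_grad():
        adv_text_emb = F.normalize(text_encoder(adv_text_ids), dim=-1)
    for _ in range(steps):
        adv_image.requires_grad_(True)
        img_emb = F.normalize(image_encoder(adv_image), dim=-1)
        # Maximizing this loss minimizes image-text similarity.
        loss = -(img_emb * adv_text_emb).sum(dim=-1).mean()
        grad = torch.autograd.grad(loss, adv_image)[0]
        with torch.no_grad():
            adv_image = adv_image + alpha * grad.sign()
            adv_image = image + (adv_image - image).clamp(-eps, eps)  # L-inf budget
            adv_image = adv_image.clamp(0, 1)                         # valid pixel range
    return adv_image.detach()


if __name__ == "__main__":
    image = torch.rand(1, 3, 32, 32)
    text_ids = torch.randint(0, 1000, (1, 8))
    adv_text_ids = text_ids.clone()
    adv_text_ids[0, 2] = 7  # placeholder for a word-substitution text attack
    adv_image = collaborative_image_attack(ToyImageEncoder(), ToyTextEncoder(), image, adv_text_ids)
    print((adv_image - image).abs().max())  # perturbation stays within the budget
```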