Paper Title
Adversarial Robustness of Visual Dialog
Paper Authors
Paper Abstract
Adversarial robustness evaluates the worst-case performance scenario of a machine learning model to ensure its safety and reliability. This study is the first to investigate the robustness of visually grounded dialog models against textual attacks. These attacks represent a worst-case scenario where the input question contains a synonym which causes the previously correct model to return a wrong answer. Using this scenario, we first aim to understand how multimodal input components contribute to model robustness. Our results show that models which encode dialog history are more robust, and when launching an attack on history, model prediction becomes more uncertain. This is in contrast to prior work which finds that dialog history is negligible for model performance on this task. We also evaluate how to generate adversarial test examples which successfully fool the model but remain undetected by the user/software designer. We find that both the textual and the visual context are important for generating plausible worst-case scenarios.
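To make the attack scenario concrete, below is a minimal sketch of a single-word synonym-substitution attack on an input question. It is not the paper's implementation: `is_correct` is a hypothetical callback that runs the visual dialog model on a fixed image, dialog history, and answer candidates, and WordNet is used purely as an illustrative synonym source.

```python
# Minimal sketch (assumed, not the authors' method) of a synonym-substitution
# attack on a visual dialog question. Requires: pip install nltk, plus
# nltk.download("wordnet").
from itertools import chain

from nltk.corpus import wordnet


def synonyms(word):
    """Collect WordNet lemmas that could replace `word` (a coarse heuristic)."""
    lemmas = chain.from_iterable(s.lemma_names() for s in wordnet.synsets(word))
    return {l.replace("_", " ") for l in lemmas if l.lower() != word.lower()}


def attack_question(question, is_correct):
    """Try single-word synonym swaps until the model's answer flips.

    `is_correct(question) -> bool` is a hypothetical callback wrapping the
    visual dialog model; the image, dialog history, and candidate answers
    are held fixed. Returns an adversarial question, or None if no single
    swap fools the model.
    """
    tokens = question.split()
    for i, tok in enumerate(tokens):
        for syn in synonyms(tok):
            candidate = " ".join(tokens[:i] + [syn] + tokens[i + 1:])
            if not is_correct(candidate):  # previously correct model now fails
                return candidate
    return None
```

In this sketch the swap is accepted only if the model's answer becomes wrong, matching the abstract's worst-case definition; checking that the substitution also stays plausible given the textual and visual context (the paper's second concern) would require an additional filter not shown here.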