Paper Title
AdvExpander: Generating Natural Language Adversarial Examples by Expanding Text
Paper Authors
Paper Abstract
Adversarial examples are vital for exposing the vulnerability of machine learning models. Despite the success of the most popular substitution-based methods, which substitute some characters or words in the original examples, substitution alone is insufficient to uncover all robustness issues of models. In this paper, we present AdvExpander, a method that crafts new adversarial examples by expanding text, which is complementary to previous substitution-based methods. We first utilize linguistic rules to determine which constituents to expand and what types of modifiers to expand them with. We then expand each constituent by inserting an adversarial modifier searched from a CVAE-based generative model that is pre-trained on a large-scale corpus. To search for adversarial modifiers, we directly search for adversarial latent codes in the latent space without tuning the pre-trained parameters. To ensure that our adversarial examples are label-preserving for text matching, we also constrain the modifications with a heuristic rule. Experiments on three classification tasks verify the effectiveness of AdvExpander and the validity of our adversarial examples. AdvExpander crafts a new type of adversarial example via text expansion, thereby promising to reveal new robustness issues.
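The abstract describes searching for adversarial latent codes in a pre-trained CVAE's latent space while keeping the generative model's parameters frozen. The minimal sketch below illustrates that idea with a derivative-free random search over latent codes; the paper's actual optimizer, decoder interface, and loss are not specified here, so `decode` and `victim_loss` are hypothetical stand-ins for the frozen CVAE decoder and the victim model's loss on the expanded text.

```python
import numpy as np

def search_adversarial_latent(decode, victim_loss, latent_dim=8,
                              n_samples=200, seed=0):
    """Sample latent codes from the CVAE prior and keep the one whose
    decoded modifier maximizes the victim model's loss.

    Only latent codes are searched; no pre-trained parameters are tuned.
    `decode` and `victim_loss` are illustrative stubs, not the paper's API.
    """
    rng = np.random.default_rng(seed)
    best_z, best_loss = None, -np.inf
    for _ in range(n_samples):
        z = rng.standard_normal(latent_dim)   # draw z from the CVAE prior
        loss = victim_loss(decode(z))         # decode modifier, score attack
        if loss > best_loss:
            best_z, best_loss = z, loss
    return best_z, best_loss
```

In practice the candidate modifier would also be filtered by the linguistic rules and the label-preserving heuristic mentioned in the abstract before the expanded example is accepted.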