Paper Title
DaST: Data-free Substitute Training for Adversarial Attacks
Paper Authors
Paper Abstract
Machine learning models are vulnerable to adversarial examples. In the black-box setting, current substitute attacks require pre-trained models to generate adversarial examples; however, pre-trained models are hard to obtain in real-world tasks. In this paper, we propose a data-free substitute training method (DaST) that obtains substitute models for adversarial black-box attacks without requiring any real data. To achieve this, DaST utilizes specially designed generative adversarial networks (GANs) to train the substitute models. In particular, we design a multi-branch architecture and a label-control loss for the generative model to deal with the uneven distribution of synthetic samples. The substitute model is then trained on the synthetic samples produced by the generative model, which are labeled by the attacked model. The experiments demonstrate that substitute models produced by DaST achieve competitive performance compared with baseline models trained on the same training set as the attacked models. Additionally, to evaluate the practicability of the proposed method on a real-world task, we attack an online machine learning model on the Microsoft Azure platform. The remote model misclassifies 98.35% of the adversarial examples crafted by our method. To the best of our knowledge, we are the first to train a substitute model for adversarial attacks without any real data.
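The core loop the abstract describes — synthesize samples, label them by querying the attacked black-box model, and fit a substitute to those labels — can be sketched as follows. This is a minimal toy illustration, not the paper's method: the "attacked model" is a hypothetical linear classifier, and a plain noise sampler stands in for DaST's multi-branch GAN generator with label-control loss.

```python
# Toy sketch of data-free substitute training: the substitute never sees
# real data, only synthetic inputs labeled by querying the black-box model.
# All models here are illustrative stand-ins, not the paper's architecture.
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical black-box "attacked" model: we may only query hard labels.
W_victim = rng.normal(size=(2, 3))

def attacked_model(x):
    return np.argmax(x @ W_victim, axis=1)  # label queries only

# Substitute model: a softmax classifier trained from scratch.
W_sub = np.zeros((2, 3))

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

lr = 0.5
for step in range(200):
    # "Generator": plain Gaussian noise here; DaST instead trains a GAN
    # to produce samples on which substitute and victim disagree.
    x = rng.normal(size=(64, 2))
    y = attacked_model(x)            # label synthetic samples via queries
    p = softmax(x @ W_sub)
    p[np.arange(len(y)), y] -= 1.0   # cross-entropy gradient w.r.t. logits
    W_sub -= lr * (x.T @ p) / len(y)

# Agreement between substitute and attacked model on fresh synthetic data
# (a stand-in for the "competitive performance" the abstract reports).
x_test = rng.normal(size=(1000, 2))
agreement = np.mean(attacked_model(x_test) == np.argmax(x_test @ W_sub, axis=1))
print(f"substitute/victim agreement: {agreement:.2f}")
```

Once the substitute agrees closely with the victim, white-box attacks (e.g. gradient-based perturbations) crafted on the substitute are expected to transfer to the black-box model.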