Paper Title


Use the Spear as a Shield: A Novel Adversarial Example based Privacy-Preserving Technique against Membership Inference Attacks

Paper Authors

Mingfu Xue, Chengxiang Yuan, Can He, Zhiyu Wu, Yushu Zhang, Zhe Liu, Weiqiang Liu

Paper Abstract


Recently, membership inference attacks have posed a serious threat to the privacy of the confidential training data of machine learning models. This paper proposes a novel adversarial example based privacy-preserving technique (AEPPT), which adds crafted adversarial perturbations to the prediction of the target model to mislead the adversary's membership inference model. The added adversarial perturbations do not affect the accuracy of the target model, but can prevent the adversary from inferring whether a specific data sample is in the training set of the target model. Since AEPPT only modifies the original output of the target model, the proposed method is general and does not require modifying or retraining the target model. Experimental results show that the proposed method can reduce the inference accuracy and precision of the membership inference model to about 50%, which is close to a random guess. Furthermore, the proposed AEPPT is also demonstrated to be effective against adaptive attacks in which the adversary knows the defense mechanism. Compared with state-of-the-art defense methods, the proposed defense can significantly degrade the accuracy and precision of membership inference attacks to 50% (i.e., the same as a random guess) while the performance and utility of the target model are not affected.
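The core idea described above can be illustrated with a minimal numpy sketch. This is not the authors' implementation: the surrogate membership-inference model (a random-weight logistic regression over the confidence vector), the sign-based iterative update, and all parameter choices are illustrative assumptions. The sketch only demonstrates the two invariants the abstract claims: the perturbation pushes the inference model's membership score toward 0.5 (a random guess) while the target model's predicted label (argmax of the confidence vector) is left unchanged.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical surrogate membership-inference model: logistic regression
# over the target model's confidence vector. Random weights, purely for
# illustration (in the paper, this would mimic the adversary's model).
w = rng.normal(size=10)

def membership_score(conf):
    """Surrogate adversary's estimate that `conf` comes from a training member."""
    return sigmoid(conf @ w)

def aeppt_perturb(conf, step=0.01, iters=200):
    """Illustrative sketch of the AEPPT idea: iteratively perturb the
    confidence vector so the surrogate membership score moves toward 0.5,
    keeping it a valid probability vector with an unchanged argmax."""
    label = np.argmax(conf)
    best = conf.copy()  # best perturbation found so far (init: no change)
    adv = conf.copy()
    for _ in range(iters):
        s = membership_score(adv)
        # Gradient of sigmoid(w @ x) w.r.t. x is s * (1 - s) * w.
        grad = s * (1.0 - s) * w
        # Sign-based step that moves the score toward 0.5.
        adv = adv - step * np.sign(s - 0.5) * np.sign(grad)
        adv = np.clip(adv, 1e-6, None)
        adv = adv / adv.sum()  # renormalize to a probability vector
        # Only accept perturbations that preserve the predicted label.
        if (np.argmax(adv) == label and
                abs(membership_score(adv) - 0.5) < abs(membership_score(best) - 0.5)):
            best = adv.copy()
    return best
```

By construction the returned vector never changes the predicted class, so the target model's accuracy is untouched, matching the abstract's claim that only the output confidences are modified. A real deployment would craft the perturbation against the defender's trained "shadow" inference model rather than random weights.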
