Paper Title


Leveraging Adversarial Examples to Quantify Membership Information Leakage

Paper Authors

Ganesh Del Grosso, Hamid Jalalzai, Georg Pichler, Catuscia Palamidessi, Pablo Piantanida

Paper Abstract


The use of personal data for training machine learning systems comes with a privacy threat, and measuring the level of privacy of a model is one of the major challenges in machine learning today. Identifying training data from a trained model is a standard way of measuring the privacy risk induced by the model. We develop a novel approach to the problem of membership inference in pattern recognition models, relying on information provided by adversarial examples. The strategy we propose consists of measuring the magnitude of the perturbation necessary to build an adversarial example; we argue that this quantity reflects the likelihood of belonging to the training data. Extensive numerical experiments on multivariate data and an array of state-of-the-art target models show that our method performs comparably to, or even outperforms, state-of-the-art strategies, without requiring any additional training samples.
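The core idea in the abstract, scoring membership by the size of the perturbation needed to produce an adversarial example, can be sketched in a few lines. The following is a minimal illustration assuming a PyTorch classifier; the attack (normalized gradient ascent on the loss), the step size, the iteration budget, and the decision threshold are illustrative assumptions, not the exact procedure used in the paper.

```python
import torch
import torch.nn.functional as F


def min_adversarial_perturbation(model, x, y, step=0.01, max_steps=200):
    """Estimate the smallest L2 perturbation that flips the model's
    prediction on a single input x (shape [1, ...]) with label y (shape [1]),
    using normalized gradient ascent on the loss (an illustrative attack,
    not the paper's exact procedure)."""
    model.eval()
    x_adv = x.clone().detach().requires_grad_(True)
    for _ in range(max_steps):
        logits = model(x_adv)
        if logits.argmax(dim=1).item() != y.item():
            break  # prediction flipped: an adversarial example was found
        loss = F.cross_entropy(logits, y)
        (grad,) = torch.autograd.grad(loss, x_adv)
        with torch.no_grad():
            # take a small step toward the decision boundary
            x_adv += step * grad / (grad.norm() + 1e-12)
    return (x_adv.detach() - x).norm().item()  # L2 size of the perturbation


def is_member(model, x, y, threshold):
    """Training points tend to lie farther from the decision boundary,
    so a larger minimal perturbation is taken as evidence of membership
    ('threshold' is a hypothetical calibration value)."""
    return min_adversarial_perturbation(model, x, y) >= threshold
```

In this sketch the membership decision reduces to thresholding a single scalar per sample, which matches the abstract's claim that no additional training samples (e.g., for shadow models) are required; calibrating the threshold is left to the user.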
