Paper Title

Data and Model Dependencies of Membership Inference Attack

Authors

Shakila Mahjabin Tonni, Dinusha Vatsalan, Farhad Farokhi, Dali Kaafar, Zhigang Lu, Gioacchino Tangari

Abstract

Machine learning (ML) models have been shown to be vulnerable to Membership Inference Attacks (MIA), which infer the membership of a given data point in the target dataset by observing the prediction output of the ML model. While the key factors for the success of MIA are not yet fully understood, existing defense mechanisms such as L2 regularization \cite{10shokri2017membership} and dropout layers \cite{salem2018ml} take only the model's overfitting property into consideration. In this paper, we provide an empirical analysis of the impact of both data and ML model properties on the vulnerability of ML techniques to MIA. Our results reveal the relationship between MIA accuracy and the properties of the dataset and training model in use. In particular, we show that the size of the shadow dataset, the class and feature balance and the entropy of the target dataset, and the configuration and fairness of the training model are the most influential factors. Based on these experimental findings, we conclude that, along with model overfitting, multiple properties jointly contribute to MIA success rather than any single property. Building on these findings, we propose using those data and model properties as regularizers to protect ML models against MIA. Our results show that the proposed defense mechanisms can reduce MIA accuracy by up to 25\% without sacrificing the ML model's prediction utility.
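To make the proposed defense concrete, the sketch below shows how a property-based penalty can be added to a standard classification loss during training. It is a minimal illustration in PyTorch, not the authors' exact formulation: the regularizer used here (penalizing low output entropy, i.e., overconfident predictions) is an assumed stand-in for the data and model properties the paper identifies, and the weight `lam` is a hypothetical hyperparameter trading prediction utility against MIA resistance.

```python
# Minimal sketch (PyTorch assumed): training loss augmented with a
# property-based regularizer, in the spirit of the paper's proposed defense.
# The entropy term below is an illustrative assumption, not the authors' exact method.
import torch
import torch.nn as nn
import torch.nn.functional as F


def entropy_regularizer(logits: torch.Tensor) -> torch.Tensor:
    """Return the mean negative entropy of the softmax outputs.

    Minimizing this term pushes the model toward less confident (higher-entropy)
    predictions, which reduces the membership signal readable from its outputs.
    """
    probs = F.softmax(logits, dim=1)
    log_probs = F.log_softmax(logits, dim=1)
    return (probs * log_probs).sum(dim=1).mean()  # equals -H(p), averaged over the batch


def training_step(model: nn.Module, x: torch.Tensor, y: torch.Tensor,
                  optimizer: torch.optim.Optimizer, lam: float = 0.1) -> float:
    """One optimization step on the regularized objective
    L = CE(f(x), y) + lam * R(f(x))."""
    optimizer.zero_grad()
    logits = model(x)
    loss = F.cross_entropy(logits, y) + lam * entropy_regularizer(logits)
    loss.backward()
    optimizer.step()
    return loss.item()


if __name__ == "__main__":
    # Toy usage on random data; `lam` controls the utility/privacy trade-off.
    model = nn.Sequential(nn.Linear(20, 64), nn.ReLU(), nn.Linear(64, 2))
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
    x, y = torch.randn(32, 20), torch.randint(0, 2, (32,))
    print(training_step(model, x, y, optimizer))
```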
