Paper Title
How to Combine Membership-Inference Attacks on Multiple Updated Models
Paper Authors
Paper Abstract
A large body of research has shown that machine learning models are vulnerable to membership-inference (MI) attacks, which violate the privacy of the participants in the training data. Most MI research focuses on a single standalone model, while production machine-learning platforms typically update models over time, often on data whose distribution shifts, giving the attacker more information. This paper proposes new attacks that take advantage of one or more model updates to improve MI. A key part of our approach is to leverage rich information from standalone MI attacks mounted separately against the original and updated models, and to combine this information in specific ways to improve attack effectiveness. We propose a set of combination functions and tuning methods for each, and present both analytical and quantitative justification for the various options. Our results on four public datasets show that our attacks effectively use update information to give the adversary a significant advantage not only over attacks on standalone models, but also over a prior MI attack that exploits model updates in the related machine-unlearning setting. We perform the first measurements of the impact of distribution shift on MI attacks with model updates, and show that a drastic distribution shift results in significantly higher MI risk than a gradual shift. Our code is available at https://www.github.com/stanleykywu/model-updates.
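The abstract's core idea, combining standalone MI signals from the original and updated models, can be illustrated with a minimal sketch. The paper's actual combination functions and tuning methods are not specified here; the loss-threshold score and the additive combination below are illustrative assumptions, not the authors' method.

```python
# Hypothetical sketch: combine standalone MI scores from two model snapshots.
# A sample kept in training tends to have low loss under both the original
# and the updated model, so summing per-model scores amplifies the signal.

def standalone_mi_score(loss, threshold=1.0):
    """Loss-threshold MI score for one model: higher => more likely a member."""
    return threshold - loss  # positive when the sample's loss is below threshold

def combined_mi_score(loss_original, loss_updated, threshold=1.0):
    """Combine the standalone scores from the original and updated models."""
    return (standalone_mi_score(loss_original, threshold)
            + standalone_mi_score(loss_updated, threshold))

# A member sample with low loss under both models scores higher than a
# non-member sample with high loss under both.
member_score = combined_mi_score(0.2, 0.1)      # (original, updated) losses
nonmember_score = combined_mi_score(1.5, 1.4)
assert member_score > nonmember_score
```

Other combinations (e.g., taking a product of per-model membership probabilities, or weighting the updated model more heavily) are equally plausible; the paper evaluates and tunes such choices empirically.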