Title
Group Property Inference Attacks Against Graph Neural Networks
Authors
Abstract
With the fast adoption of machine learning (ML) techniques, sharing of ML models is becoming popular. However, ML models are vulnerable to privacy attacks that leak information about their training data. In this work, we focus on a particular type of privacy attack named the property inference attack (PIA), which infers sensitive properties of the training data through access to the target ML model. In particular, we consider Graph Neural Networks (GNNs) as the target model, and the distribution of particular groups of nodes and links in the training graph as the target property. While existing work has investigated PIAs that target graph-level properties, no prior work has studied the inference of node and link properties at the group level. In this work, we perform the first systematic study of group property inference attacks (GPIA) against GNNs. First, we consider a taxonomy of threat models under both black-box and white-box settings with various types of adversary knowledge, and design six different attacks for these settings. We evaluate the effectiveness of these attacks through extensive experiments on three representative GNN models and three real-world graphs. Our results demonstrate the effectiveness of these attacks, whose accuracy outperforms the baseline approaches. Second, we analyze the underlying factors that contribute to GPIA's success, and show that target models trained on graphs with and without the target property exhibit some dissimilarity in model parameters and/or model outputs, which enables the adversary to infer the existence of the property. Further, we design a set of defense mechanisms against GPIA and demonstrate that they can effectively reduce attack accuracy with only a small loss in GNN model accuracy.
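The abstract's key observation is that models trained on data with vs. without a property differ in their parameters and/or outputs, so an adversary can train a meta-classifier over shadow-model signatures to detect the property. The toy sketch below illustrates that idea only; it is not the paper's GPIA pipeline. The GNN shadow models are replaced by logistic-regression stand-ins, the "property" is a hypothetical shift in the feature distribution, and all dataset sizes and names are illustrative assumptions.

```python
# Toy property-inference sketch (assumption-laden: logistic regression stands
# in for a GNN; the "property" is a shifted feature distribution).
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

def make_dataset(has_property):
    # "Property" present = training features drawn from a shifted distribution.
    shift = 1.0 if has_property else 0.0
    X = rng.normal(shift, 1.0, size=(200, 5))
    y = (X.sum(axis=1) > 5 * shift).astype(int)
    return X, y

def shadow_signature(has_property):
    # Train a shadow model and summarize it by its learned parameters,
    # mirroring the white-box setting the abstract describes.
    X, y = make_dataset(has_property)
    clf = LogisticRegression(max_iter=1000).fit(X, y)
    return np.concatenate([clf.coef_.ravel(), clf.intercept_])

# Meta-training set: one signature per shadow model, labeled by
# whether its training data had the property.
sigs = np.array([shadow_signature(b) for b in [True] * 30 + [False] * 30])
labels = np.array([1] * 30 + [0] * 30)
meta = LogisticRegression(max_iter=1000).fit(sigs, labels)

# Attack a fresh "target" model whose training data has the property.
target_sig = shadow_signature(True).reshape(1, -1)
print("property inferred:", bool(meta.predict(target_sig)[0]))
```

In this setup, the property mainly shows up in the shadow models' intercepts, so the meta-classifier separates the two groups easily; a black-box variant would instead build signatures from model outputs on probe inputs.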