Paper title
Graph Generative Model for Benchmarking Graph Neural Networks
Authors
Abstract
As the field of Graph Neural Networks (GNNs) continues to grow, it experiences a corresponding increase in the need for large, real-world datasets to train and test new GNN models on challenging, realistic problems. Unfortunately, such graph datasets are often generated from online, highly privacy-restricted ecosystems, which makes research and development on these datasets hard, if not impossible. This greatly reduces the number of benchmark graphs available to researchers, causing the field to rely on only a handful of publicly available datasets. To address this problem, we introduce a novel graph generative model, Computation Graph Transformer (CGT), that learns and reproduces the distribution of real-world graphs in a privacy-controlled way. More specifically, CGT (1) generates effective benchmark graphs on which GNNs show task performance similar to that on the source graphs, (2) scales to process large-scale graphs, and (3) incorporates off-the-shelf privacy modules to guarantee end-user privacy of the generated graph. Extensive experiments across a vast body of graph generative models show that only our model can successfully generate privacy-controlled, synthetic substitutes of large-scale real-world graphs that can be effectively used to benchmark GNN models.