Paper Title
Rethinking the Setting of Semi-supervised Learning on Graphs
Paper Authors
Paper Abstract
We argue that the present setting of semi-supervised learning on graphs may result in unfair comparisons, due to its potential risk of over-tuning hyper-parameters for models. In this paper, we highlight the significant influence of tuning hyper-parameters, which leverages the label information in the validation set to improve performance. To explore the limit of over-tuning hyper-parameters, we propose ValidUtil, an approach that fully utilizes the label information in the validation set through an extra group of hyper-parameters. With ValidUtil, even GCN can easily reach a high accuracy of 85.8% on Cora. To avoid over-tuning, we merge the training set and the validation set and construct an i.i.d. graph benchmark (IGB) consisting of 4 datasets. Each dataset contains 100 i.i.d. graphs sampled from a large graph to reduce the evaluation variance. Our experiments suggest that IGB is a more stable benchmark than previous datasets for semi-supervised learning on graphs.
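The abstract's central claim is that ordinary hyper-parameter tuning can leak validation labels. Below is a minimal Python sketch of that mechanism, assuming a hypothetical `predict_fn` whose predictions on chosen nodes can be overridden by the extra hyper-parameters; it illustrates the idea behind ValidUtil, not the authors' actual implementation. Tuning each node's hyper-parameter to maximize validation accuracy recovers exactly that node's label, which could then serve as extra supervision.

```python
# A minimal sketch (not the paper's released code) of the ValidUtil idea:
# one extra "hyper-parameter" per validation node, tuned against validation
# accuracy, is enough to recover that node's label. `predict_fn` and its
# override mechanism are assumed interfaces for illustration only.

import numpy as np

def tune_validutil_hyperparams(predict_fn, val_nodes, val_labels, num_classes):
    """Grid-search one label-valued hyper-parameter per validation node.

    predict_fn(overrides) -> {node: predicted_label}, where `overrides`
    forces the model's output on the given nodes (an assumed interface).
    """
    recovered = {}
    for node in val_nodes:
        best_acc, best_label = -1.0, None
        for c in range(num_classes):  # try each class as the hyper-parameter value
            preds = predict_fn({**recovered, node: c})
            acc = np.mean([preds[v] == y for v, y in zip(val_nodes, val_labels)])
            if acc > best_acc:  # validation accuracy peaks when c is the true label
                best_acc, best_label = acc, c
        recovered[node] = best_label
    return recovered  # equals the validation labels: they have fully leaked


if __name__ == "__main__":
    val_nodes, true_labels = [0, 1, 2], [2, 0, 1]

    def dummy_predict(overrides):
        # A stand-in model that predicts class 0 unless overridden.
        return {v: overrides.get(v, 0) for v in val_nodes}

    print(tune_validutil_hyperparams(dummy_predict, val_nodes, true_labels, 3))
    # -> {0: 2, 1: 0, 2: 1}
```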