Paper Title
Rethinking the Setting of Semi-supervised Learning on Graphs
Paper Authors
Paper Abstract
We argue that the present setting of semi-supervised learning on graphs may result in unfair comparisons, due to its potential risk of over-tuning hyper-parameters for models. In this paper, we highlight the significant influence of tuning hyper-parameters, which leverages the label information in the validation set to improve performance. To explore the limit of over-tuning hyper-parameters, we propose ValidUtil, an approach that fully utilizes the label information in the validation set through an extra group of hyper-parameters. With ValidUtil, even GCN can easily reach a high accuracy of 85.8% on Cora. To avoid over-tuning, we merge the training set and the validation set and construct an i.i.d. graph benchmark (IGB) consisting of 4 datasets. Each dataset contains 100 i.i.d. graphs sampled from a large graph to reduce the evaluation variance. Our experiments suggest that IGB is a more stable benchmark than previous datasets for semi-supervised learning on graphs.
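The abstract's central claim is that ordinary hyper-parameter tuning can leak validation labels. Below is a minimal Python sketch of that mechanism, assuming a hypothetical `predict_fn` whose predictions on chosen nodes can be overridden by the extra hyper-parameters; it illustrates the idea behind ValidUtil, not the authors' actual implementation. Tuning each node's hyper-parameter to maximize validation accuracy recovers exactly that node's label, which could then serve as extra supervision.

```python
# A minimal sketch (not the paper's released code) of the ValidUtil idea:
# one extra "hyper-parameter" per validation node, tuned against validation
# accuracy, is enough to recover that node's label. `predict_fn` and its
# override mechanism are assumed interfaces for illustration only.

import numpy as np

def tune_validutil_hyperparams(predict_fn, val_nodes, val_labels, num_classes):
    """Grid-search one label-valued hyper-parameter per validation node.

    predict_fn(overrides) -> {node: predicted_label}, where `overrides`
    forces the model's output on the given nodes (an assumed interface).
    """
    recovered = {}
    for node in val_nodes:
        best_acc, best_label = -1.0, None
        for c in range(num_classes):  # try each class as the hyper-parameter value
            preds = predict_fn({**recovered, node: c})
            acc = np.mean([preds[v] == y for v, y in zip(val_nodes, val_labels)])
            if acc > best_acc:  # validation accuracy peaks when c is the true label
                best_acc, best_label = acc, c
        recovered[node] = best_label
    return recovered  # equals the validation labels: they have fully leaked


if __name__ == "__main__":
    val_nodes, true_labels = [0, 1, 2], [2, 0, 1]

    def dummy_predict(overrides):
        # A stand-in model that predicts class 0 unless overridden.
        return {v: overrides.get(v, 0) for v in val_nodes}

    print(tune_validutil_hyperparams(dummy_predict, val_nodes, true_labels, 3))
    # -> {0: 2, 1: 0, 2: 1}
```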