Paper Title

FreeKD: Free-direction Knowledge Distillation for Graph Neural Networks

Paper Authors

Kaituo Feng, Changsheng Li, Ye Yuan, Guoren Wang

Paper Abstract

Knowledge distillation (KD) has demonstrated its effectiveness in boosting the performance of graph neural networks (GNNs), where the goal is to distill knowledge from a deeper teacher GNN into a shallower student GNN. However, it is often difficult to train a satisfactory teacher GNN in practice due to the well-known over-parameterization and over-smoothing issues, leading to ineffective knowledge transfer in real applications. In this paper, we propose the first Free-direction Knowledge Distillation framework via Reinforcement learning for GNNs, called FreeKD, which no longer requires a deeper, well-optimized teacher GNN. The core idea of our work is to collaboratively build two shallower GNNs that exchange knowledge with each other via reinforcement learning in a hierarchical way. Observing that a typical GNN model often performs better on some nodes and worse on others during training, we devise a dynamic, free-direction knowledge transfer strategy that consists of two levels of actions: 1) a node-level action determines the direction of knowledge transfer between the corresponding nodes of the two networks; and 2) a structure-level action determines which of the local structures generated by the node-level actions should be propagated. In essence, FreeKD is a general and principled framework that is naturally compatible with GNNs of different architectures. Extensive experiments on five benchmark datasets demonstrate that FreeKD outperforms the two base GNNs by a large margin and shows its efficacy for various GNNs. More surprisingly, FreeKD achieves comparable or even better performance than traditional KD algorithms that distill knowledge from a deeper and stronger teacher GNN.
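To make the free-direction idea more concrete, below is a minimal sketch of per-node, direction-aware distillation between two shallow GNNs trained together. It is not the paper's implementation: the hierarchical reinforcement-learning agents and the structure-level action are omitted, and the per-node transfer direction is chosen by a simple confidence heuristic instead of the learned node-level action. The names `ShallowGCN` and `free_direction_kd_step`, the choice of PyTorch Geometric's `GCNConv`, and the `data` fields (`x`, `edge_index`, `y`, `train_mask`) are illustrative assumptions.

```python
# Simplified sketch of free-direction knowledge exchange between two shallow GNNs.
# NOT the paper's RL-based method: direction is picked by a confidence heuristic,
# and the structure-level action is omitted.
import torch
import torch.nn.functional as F
from torch_geometric.nn import GCNConv  # assumed base GNN layer


class ShallowGCN(torch.nn.Module):
    """A two-layer GCN producing node classification logits."""
    def __init__(self, in_dim, hid_dim, num_classes):
        super().__init__()
        self.conv1 = GCNConv(in_dim, hid_dim)
        self.conv2 = GCNConv(hid_dim, num_classes)

    def forward(self, x, edge_index):
        h = F.relu(self.conv1(x, edge_index))
        return self.conv2(h, edge_index)


def _masked_mean(t, mask):
    # Mean over selected entries; zero if the mask selects nothing.
    return t[mask].mean() if mask.any() else t.new_zeros(())


def free_direction_kd_step(net_a, net_b, data, opt_a, opt_b, kd_weight=1.0, T=2.0):
    """One joint training step: each network teaches the other on the nodes
    where it is currently more confident (stand-in for the node-level action)."""
    logits_a = net_a(data.x, data.edge_index)
    logits_b = net_b(data.x, data.edge_index)

    # Supervised losses on labeled nodes.
    ce_a = F.cross_entropy(logits_a[data.train_mask], data.y[data.train_mask])
    ce_b = F.cross_entropy(logits_b[data.train_mask], data.y[data.train_mask])

    # Node-level direction (simplified): the more confident network on a node
    # acts as that node's teacher.
    with torch.no_grad():
        conf_a = F.softmax(logits_a, dim=-1).max(dim=-1).values
        conf_b = F.softmax(logits_b, dim=-1).max(dim=-1).values
        a_teaches = conf_a >= conf_b  # True where net_a teaches net_b

    log_p_a = F.log_softmax(logits_a / T, dim=-1)
    log_p_b = F.log_softmax(logits_b / T, dim=-1)
    q_a = F.softmax(logits_a / T, dim=-1).detach()
    q_b = F.softmax(logits_b / T, dim=-1).detach()

    # Per-node KL(student || teacher), masked by the chosen direction.
    kd_b_from_a = _masked_mean(F.kl_div(log_p_b, q_a, reduction='none').sum(-1), a_teaches)
    kd_a_from_b = _masked_mean(F.kl_div(log_p_a, q_b, reduction='none').sum(-1), ~a_teaches)

    loss = ce_a + ce_b + kd_weight * (kd_a_from_b + kd_b_from_a)
    opt_a.zero_grad()
    opt_b.zero_grad()
    loss.backward()
    opt_a.step()
    opt_b.step()
    return loss.item()
```

In the full FreeKD framework described in the abstract, the confidence heuristic above would be replaced by the learned node-level action, and a structure-level action would additionally select which local structures (neighborhoods) to propagate.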
