Paper Title
ALPINE: Active Link Prediction using Network Embedding
Authors
Abstract
Many real-world problems can be formalized as predicting links in a partially observed network. Examples include Facebook friendship suggestions, consumer-product recommendations, and the identification of hidden interactions between actors in a crime network. Several link prediction algorithms, notably those recently introduced using network embedding, do this by relying only on the observed part of the network. Often, the link status of a node pair can be queried, and the answer can be used as additional information by the link prediction algorithm. Unfortunately, such queries can be expensive or time-consuming, so the node pairs to query must be chosen carefully. In this paper, we estimate the improvement in link prediction accuracy after querying any particular node pair, for use in an active learning setup. Specifically, we propose ALPINE (Active Link Prediction usIng Network Embedding), the first method to achieve this for link prediction based on network embedding. To this end, we generalize the notion of V-optimality from experimental design to this setting, as well as more basic active learning heuristics originally developed in standard classification settings. Empirical results on real data show that ALPINE is scalable and boosts link prediction accuracy with far fewer queries.