Paper Title
Unsupervised Graph Outlier Detection: Problem Revisit, New Insight, and Superior Method
Authors
Abstract
Graph Outlier Detection (GOD) has attracted a large number of studies in recent years owing to its wide range of applications, and Unsupervised Node Outlier Detection (UNOD) on attributed networks is an important area within it. UNOD focuses on detecting two typical kinds of outliers in graphs: structural outliers and contextual outliers. Most existing works conduct experiments on datasets with injected outliers. However, we find that the most widely used outlier injection approach has a serious data leakage issue. By exploiting this leakage alone, a simple approach can achieve state-of-the-art performance in detecting outliers. In addition, we observe that the performance of existing algorithms drops once the data leakage is mitigated. The other major issue is balancing detection performance between the two types of outliers, which existing studies have not considered. In this paper, we analyze the cause of the data leakage issue in depth, since the injection approach is a building block for advancing UNOD. Moreover, we devise a novel variance-based model to detect structural outliers, which significantly outperforms existing algorithms and is more robust across different injection settings. On top of this, we propose a new framework, Variance based Graph Outlier Detection (VGOD), which combines our variance-based model with an attribute reconstruction model to detect outliers in a balanced way. Finally, we conduct extensive experiments to demonstrate the effectiveness and efficiency of VGOD. The results on 5 real-world datasets validate that VGOD not only achieves the best performance in detecting outliers but also delivers balanced detection performance between structural and contextual outliers.
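The intuition behind a variance-based structural score can be illustrated with a small sketch. This is not the VGOD model itself, whose details the abstract does not give. It assumes that the variance is taken over the raw feature vectors of each node's neighbors, and that the structural and contextual scores are combined by standardizing each and summing them. The function names `neighbor_variance_scores` and `combine_scores` are hypothetical.

```python
import numpy as np

def neighbor_variance_scores(adj, features):
    """Score each node by the total variance of its neighbors' features.

    Assumption: a node whose neighbors come from different communities
    sees more varied neighbor features, so it gets a higher score.
    """
    adj = np.asarray(adj, dtype=float)
    features = np.asarray(features, dtype=float)
    degree = adj.sum(axis=1, keepdims=True)
    safe_degree = np.maximum(degree, 1.0)           # avoid division by zero
    mean = adj @ features / safe_degree             # E[x] over neighbors
    mean_sq = adj @ (features ** 2) / safe_degree   # E[x^2] over neighbors
    variance = np.maximum(mean_sq - mean ** 2, 0.0) # clip rounding noise
    return variance.sum(axis=1)                     # sum over feature dims

def combine_scores(structural, contextual):
    """Standardize both score vectors and add them (an assumed scheme)."""
    def standardize(x):
        x = np.asarray(x, dtype=float)
        std = x.std()
        return (x - x.mean()) / std if std > 0 else np.zeros_like(x)
    return standardize(structural) + standardize(contextual)

# Toy graph: nodes 0-2 form one community, nodes 3-4 another,
# and node 5 is wired into both.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (5, 0), (5, 1), (5, 3), (5, 4)]
adj = np.zeros((6, 6))
for u, v in edges:
    adj[u, v] = adj[v, u] = 1.0
features = np.array([[0, 0], [0, 0], [0, 0], [10, 10], [10, 10], [5, 5]])

scores = neighbor_variance_scores(adj, features)
print(int(np.argmax(scores)))  # index of the node with the most varied neighborhood
```

In this toy graph the node that bridges the two communities receives the highest score, because its neighbors' features disagree the most. A contextual score, for example a per-node attribute reconstruction error, could be passed to `combine_scores` together with these values.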