Paper Title
Outlier Detection Using a Novel Method: Quantum Clustering
Authors
Abstract
We propose a new assumption for outlier detection: normal data instances are commonly located in regions where the data density hardly fluctuates, while outliers often appear in regions where the data density fluctuates violently. Based on this hypothesis, we apply a novel density-based approach to unsupervised outlier detection. This approach, called Quantum Clustering (QC), processes unlabeled data and constructs a potential function to find both the centroids of clusters and the outliers. Experiments show that the potential function can effectively reveal outliers hidden among the data points. In addition, adjusting the parameter $σ$ allows QC to find more subtle outliers. Our approach is also evaluated on two datasets from two different research areas (Air Quality Detection and the Darwin Correspondence Project), and the results show the wide applicability of our method.
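The abstract does not state the potential function itself, so the following is only a minimal sketch under stated assumptions. It uses the standard Quantum Clustering potential from the original QC literature (Horn and Gottlieb), $V(x) = E - \frac{d}{2} + \frac{1}{2σ^2 ψ(x)} \sum_i \|x - x_i\|^2 e^{-\|x - x_i\|^2 / (2σ^2)}$ with $ψ(x) = \sum_i e^{-\|x - x_i\|^2 / (2σ^2)}$, which may differ from the exact form used in this paper. The function name `qc_potential`, the toy data, and the rule "highest potential is the most suspicious point" are illustrative assumptions, not the authors' procedure.

```python
import numpy as np


def qc_potential(points, sigma):
    """Evaluate the (assumed) QC potential V at every data point.

    points: array of shape (n, d); sigma: Gaussian kernel width.
    The constant E is chosen so that the smallest returned value is 0.
    """
    points = np.asarray(points, dtype=float)
    n, d = points.shape
    # Pairwise squared Euclidean distances, shape (n, n).
    diff = points[:, None, :] - points[None, :, :]
    sq_dist = np.sum(diff ** 2, axis=-1)
    # Gaussian kernel values; row sums give the wave function psi.
    gauss = np.exp(-sq_dist / (2.0 * sigma ** 2))
    psi = gauss.sum(axis=1)
    # V - E = -d/2 + (1 / (2 sigma^2 psi)) * sum_i |x - x_i|^2 * gauss_i
    v = (sq_dist * gauss).sum(axis=1) / (2.0 * sigma ** 2 * psi) - d / 2.0
    return v - v.min()


# Hypothetical toy data: a tight 5x5 grid cluster plus one stray point.
axis = np.linspace(-0.2, 0.2, 5)
gx, gy = np.meshgrid(axis, axis)
cluster = np.column_stack([gx.ravel(), gy.ravel()])
outlier = np.array([[1.5, 1.5]])
data = np.vstack([cluster, outlier])

potential = qc_potential(data, sigma=1.0)
# Assumed scoring rule: the point with the highest potential is flagged.
most_suspicious = int(np.argmax(potential))
```

In this toy setting the stray point (index 25) receives the highest potential, while the grid points stay close to the minimum. The result depends on $σ$: for a very small $σ$, an isolated point can sit at a potential minimum of its own, so in practice $σ$ would need to be tuned, in line with the abstract's remark about adjusting it.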