Paper Title
Parzen Window Approximation on Riemannian Manifold
Paper Authors
Paper Abstract
In graph-motivated learning, label propagation depends largely on data affinity, represented as edges between connected data points. Affinity assignment implicitly assumes that the data are evenly distributed on the manifold. This assumption may not hold, and it may lead to inaccurate metric assignment because of drift towards high-density regions. A drift-affected, heat-kernel-based affinity with a globally fixed Parzen window either discards genuine neighbors or forces distant data points to become members of the neighborhood, which yields a biased affinity matrix. In this paper, the bias due to uneven data sampling on the Riemannian manifold is addressed by a variable Parzen window determined as a function of neighborhood size, ambient dimension, flatness range, etc. Additionally, an affinity adjustment is used to offset the effect of the uneven sampling responsible for the bias. An affinity metric that accounts for the irregular sampling effect is proposed to yield accurate label propagation. Extensive experiments on synthetic and real-world data sets confirm that the proposed method significantly increases classification accuracy and outperforms existing Parzen window estimators in graph Laplacian manifold regularization methods.
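As a rough illustration of the problem the abstract describes, the following Python sketch contrasts a heat kernel affinity that uses one globally fixed bandwidth with a variant whose bandwidth varies per point. The variable bandwidth here is the distance to each point's k-th nearest neighbor, a common local-scaling heuristic used only as a stand-in. It is an assumption and not the window function proposed in the paper, which also depends on ambient dimension, flatness range, and other quantities the abstract does not specify. The function names, the toy data, and the parameters `sigma` and `k` are all hypothetical.

```python
import numpy as np


def pairwise_sq_dists(X):
    # Squared Euclidean distance between every pair of rows in X.
    diff = X[:, None, :] - X[None, :, :]
    return np.sum(diff ** 2, axis=-1)


def fixed_window_affinity(X, sigma):
    # Heat kernel with a single global bandwidth sigma for all pairs.
    d2 = pairwise_sq_dists(X)
    W = np.exp(-d2 / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)  # no self-loops
    return W


def local_window_affinity(X, k):
    # ASSUMPTION: per-point bandwidth = distance to the k-th nearest neighbor.
    # This is a generic local-scaling heuristic, not the paper's estimator.
    d2 = pairwise_sq_dists(X)
    d = np.sqrt(d2)
    # After sorting each row, column 0 is the zero self-distance,
    # so column k holds the k-th nearest neighbor distance.
    sigma = np.sort(d, axis=1)[:, k]
    W = np.exp(-d2 / (sigma[:, None] * sigma[None, :]))
    np.fill_diagonal(W, 0.0)  # no self-loops
    return W


# Toy data (hypothetical): a densely sampled region and a sparsely sampled one.
dense = np.array([[0.0], [0.1], [0.2], [0.3]])
sparse = np.array([[10.0], [11.0], [12.0], [13.0]])
X = np.vstack([dense, sparse])

W_fixed = fixed_window_affinity(X, sigma=0.1)  # bandwidth tuned to the dense region
W_local = local_window_affinity(X, k=1)

print("fixed window, dense pair :", W_fixed[0, 1])
print("fixed window, sparse pair:", W_fixed[4, 5])
print("local window, dense pair :", W_local[0, 1])
print("local window, sparse pair:", W_local[4, 5])
```

In this toy set, a fixed window tuned to the dense region assigns near-zero affinity to adjacent points in the sparse region, which effectively discards genuine neighbors. The per-point window gives adjacent pairs in both regions comparable affinity.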