Paper title
Rehabilitating Isomap: Euclidean Representation of Geodesic Structure
Authors
Abstract
Manifold learning techniques for nonlinear dimension reduction assume that high-dimensional feature vectors lie on a low-dimensional manifold, then attempt to exploit manifold structure to obtain useful low-dimensional Euclidean representations of the data. Isomap, a seminal manifold learning technique, is an elegant synthesis of two simple ideas: the approximation of Riemannian distances with shortest path distances on a graph that localizes manifold structure, and the approximation of shortest path distances with Euclidean distances by multidimensional scaling. We revisit the rationale for Isomap, clarifying what Isomap does and what it does not. In particular, we explore the widespread perception that Isomap should only be used when the manifold is parametrized by a convex region of Euclidean space. We argue that this perception is based on an extremely narrow interpretation of manifold learning as parametrization recovery, and we submit that Isomap is better understood as constructing Euclidean representations of geodesic structure. We reconsider a well-known example that was previously interpreted as evidence of Isomap's limitations, and we re-examine the original analysis of Isomap's convergence properties, concluding that convexity is not required for shortest path distances to converge to Riemannian distances.
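The two ideas that the abstract attributes to Isomap can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on several assumptions: the data are a toy set of points on a semicircular arc, the localizing graph is an ε-neighborhood graph (one possible choice of graph), and the multidimensional scaling step is classical MDS restricted to one dimension and computed by power iteration instead of a full eigendecomposition. All function names (`arc_points`, `epsilon_graph`, `dijkstra`, `geodesic_estimates`, `classical_mds_1d`) and the values `n = 50` and `eps = 0.2` are hypothetical and chosen only for illustration.

```python
import heapq
import math


def arc_points(n, total_angle=math.pi):
    """Return n evenly spaced points on a unit-circle arc (a curved 1-D manifold in the plane)."""
    step = total_angle / (n - 1)
    return [(math.cos(i * step), math.sin(i * step)) for i in range(n)]


def epsilon_graph(points, eps):
    """Join every pair of points whose Euclidean distance is at most eps (assumed graph rule)."""
    n = len(points)
    adj = [[] for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = math.hypot(points[i][0] - points[j][0], points[i][1] - points[j][1])
            if d <= eps:
                adj[i].append((j, d))
                adj[j].append((i, d))
    return adj


def dijkstra(adj, source):
    """Shortest path distances from source to every vertex of a weighted graph."""
    dist = [math.inf] * len(adj)
    dist[source] = 0.0
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, w in adj[u]:
            nd = d + w
            if nd < dist[v]:
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


def geodesic_estimates(points, eps):
    """Idea 1: approximate Riemannian distances by shortest path distances on the graph."""
    adj = epsilon_graph(points, eps)
    return [dijkstra(adj, s) for s in range(len(points))]


def classical_mds_1d(D, iters=100):
    """Idea 2: approximate the distances in D by Euclidean distances on a line (classical MDS)."""
    n = len(D)
    sq = [[D[i][j] ** 2 for j in range(n)] for i in range(n)]
    row_mean = [sum(r) / n for r in sq]
    grand_mean = sum(row_mean) / n
    # Double-centered matrix B = -1/2 * J * (D squared) * J.
    B = [[-0.5 * (sq[i][j] - row_mean[i] - row_mean[j] + grand_mean)
          for j in range(n)] for i in range(n)]
    # Power iteration for the leading eigenvector, starting from a non-constant vector.
    v = [float(i + 1) for i in range(n)]
    for _ in range(iters):
        w = [sum(B[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Rayleigh quotient gives the leading eigenvalue for the unit vector v.
    eigenvalue = sum(v[i] * sum(B[i][j] * v[j] for j in range(n)) for i in range(n))
    scale = math.sqrt(max(eigenvalue, 0.0))
    return [scale * x for x in v]


points = arc_points(50)
D = geodesic_estimates(points, eps=0.2)
coords = classical_mds_1d(D)

chord = math.hypot(points[0][0] - points[-1][0], points[0][1] - points[-1][1])
print("Euclidean chord between endpoints:", chord)
print("shortest path estimate of arc length:", D[0][-1])
print("span of the 1-D embedding:", max(coords) - min(coords))
```

In this toy setting the shortest path distance between the two endpoints should land close to the arc length π and well above the straight-line chord of length 2, and the span of the one-dimensional embedding should roughly match that shortest path distance. This is the sense in which the embedding represents geodesic structure in Euclidean terms.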