Paper Title
NCVis: Noise Contrastive Approach for Scalable Visualization
Paper Authors
Abstract
Modern methods for data visualization via dimensionality reduction, such as t-SNE, usually have performance issues that prohibit their application to large amounts of high-dimensional data. In this work, we propose NCVis, a high-performance dimensionality reduction method built on the sound statistical basis of noise contrastive estimation. We show that NCVis outperforms state-of-the-art techniques in terms of speed while preserving the representation quality of other methods. In particular, the proposed approach successfully processes a large dataset of more than 1 million news headlines in several minutes and presents the underlying structure in a human-readable way. Moreover, it provides results consistent with classical methods like t-SNE on more straightforward datasets, such as images of handwritten digits. We believe that broader usage of such software can significantly simplify large-scale data analysis and lower the entry barrier to this area.