Improving k-Means Clustering Performance with Disentangled Internal Representations

Paper title

Improving k-Means Clustering Performance with Disentangled Internal Representations

Authors

Agarap, Abien Fred; Azcarraga, Arnulfo P.

Abstract

Deep clustering algorithms combine representation learning and clustering by jointly optimizing a clustering loss and a non-clustering loss. In such methods, a deep neural network is used for representation learning together with a clustering network. Instead of following this framework to improve clustering performance, we propose a simpler approach of optimizing the entanglement of the learned latent code representation of an autoencoder. We define entanglement as how close pairs of points from the same class or structure are, relative to pairs of points from different classes or structures. To measure the entanglement of data points, we use the soft nearest neighbor loss, and expand it by introducing an annealing temperature factor. Using our proposed approach, the test clustering accuracy was 96.2% on the MNIST dataset, 85.6% on the Fashion-MNIST dataset, and 79.2% on the EMNIST Balanced dataset, outperforming our baseline models.
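
To make the entanglement measure concrete, the following is a minimal NumPy sketch of the soft nearest neighbor loss in its commonly cited form: the negative mean log-ratio of same-class kernel mass to total kernel mass, with kernel exp(-squared distance / T). The abstract does not give the annealing schedule, so `annealed_temperature` and its `initial` and `decay` parameters are hypothetical placeholders, and the toy data is illustrative only. This is not the authors' implementation.

```python
import numpy as np


def soft_nearest_neighbor_loss(features, labels, temperature=1.0, eps=1e-9):
    """Soft nearest neighbor loss over one batch (lower means less entangled)."""
    features = np.asarray(features, dtype=np.float64)
    labels = np.asarray(labels)
    # Pairwise squared Euclidean distances, shape (batch, batch).
    diffs = features[:, None, :] - features[None, :, :]
    sq_dists = np.sum(diffs ** 2, axis=-1)
    # Similarity kernel; a point is never counted as its own neighbor.
    kernel = np.exp(-sq_dists / temperature)
    np.fill_diagonal(kernel, 0.0)
    # Mask of pairs that share a label.
    same_class = labels[:, None] == labels[None, :]
    numerator = np.sum(kernel * same_class, axis=1)
    denominator = np.sum(kernel, axis=1)
    return float(-np.mean(np.log(eps + numerator / (eps + denominator))))


def annealed_temperature(step, initial=1.0, decay=0.5):
    """Hypothetical schedule (an assumption): temperature shrinks with the step."""
    return initial / (1.0 + step) ** decay


# Toy latent codes: two tight, well-separated groups of three points each.
codes = np.array(
    [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [5.0, 5.0], [5.1, 5.0], [5.0, 5.1]]
)
matching_labels = np.array([0, 0, 0, 1, 1, 1])   # labels follow the groups
scrambled_labels = np.array([0, 1, 0, 1, 0, 1])  # labels cut across the groups

low = soft_nearest_neighbor_loss(codes, matching_labels)
high = soft_nearest_neighbor_loss(codes, scrambled_labels)
print(low < high)
```

When the labels follow the geometric groups, almost all kernel mass around each point belongs to its own class and the loss is close to zero. When the labels cut across the groups, the loss grows. In a training loop, one would presumably evaluate the loss on the autoencoder's latent codes with `temperature=annealed_temperature(step)`, under the assumed schedule.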
