Efficient Model-Based Collaborative Filtering with Fast Adaptive PCA

Paper Title

Efficient Model-Based Collaborative Filtering with Fast Adaptive PCA

Authors

Xiangyun Ding, Wenjian Yu, Yuyang Xie, Shenghua Liu

Abstract

A model-based collaborative filtering (CF) approach using fast adaptive randomized singular value decomposition (SVD) is proposed for the matrix completion problem in recommender systems. First, a fast adaptive PCA framework is presented, which combines the fixed-precision randomized matrix factorization algorithm [1] with acceleration techniques for handling large sparse data. Then, a novel termination mechanism for the adaptive PCA is proposed to automatically determine the number of latent factors that achieves near-optimal prediction accuracy in the subsequent model-based CF. The resulting CF approach has good accuracy while inheriting high runtime efficiency. Experiments on real data show that the proposed adaptive PCA is up to 2.7X and 6.7X faster than the original fixed-precision SVD approach [1] and svds in Matlab, respectively, while preserving accuracy. The proposed model-based CF approach efficiently processes the MovieLens data with 20M ratings and exhibits more than 10X speedup over the regularized matrix factorization based approach [2] and the fast singular value thresholding approach [3], with comparable or better accuracy. It also has the advantage of being parameter-free. Compared with the deep-learning-based CF approach, the proposed approach is much more computationally efficient, with only a marginal performance loss.
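
To make the fixed-precision idea in the abstract concrete, here is a minimal sketch of a blocked randomized QB factorization that keeps adding basis vectors until a relative Frobenius-norm error target is met. It is an illustration under stated assumptions, not the authors' implementation: the function name `adaptive_rand_qb`, the parameters `rel_tol` and `block`, and the toy dense matrix are all hypothetical, and the stopping rule shown is a plain approximation-error test rather than the paper's CF-oriented termination mechanism. The paper targets large sparse data, while this sketch uses a small dense NumPy array for simplicity.

```python
import numpy as np


def adaptive_rand_qb(A, rel_tol=0.05, block=4, seed=0):
    """Grow an orthonormal basis Q block by block until
    ||A - Q @ B||_F <= rel_tol * ||A||_F, where B = Q.T @ A.

    Hypothetical helper for illustration only.
    """
    rng = np.random.default_rng(seed)
    m, n = A.shape
    max_rank = min(m, n)
    Q = np.zeros((m, 0))
    B = np.zeros((0, n))
    total = np.linalg.norm(A, "fro") ** 2
    # For orthonormal Q and B = Q.T @ A:
    #   ||A - Q B||_F^2 = ||A||_F^2 - ||B||_F^2,
    # so the residual can be tracked without forming A - Q B.
    remaining = total
    while Q.shape[1] < max_rank and remaining > (rel_tol ** 2) * total:
        b = min(block, max_rank - Q.shape[1])
        Omega = rng.standard_normal((n, b))
        # Sample the part of range(A) not yet captured by Q.
        Y = A @ Omega - Q @ (B @ Omega)
        Qi, _ = np.linalg.qr(Y)
        # Re-orthogonalize against the existing basis for stability.
        Qi, _ = np.linalg.qr(Qi - Q @ (Q.T @ Qi))
        Bi = Qi.T @ A
        Q = np.hstack([Q, Qi])
        B = np.vstack([B, Bi])
        remaining -= np.linalg.norm(Bi, "fro") ** 2
    return Q, B


# Toy data (assumption): a rank-5 matrix plus small noise.
rng = np.random.default_rng(1)
A = rng.standard_normal((60, 5)) @ rng.standard_normal((5, 40))
A += 0.01 * rng.standard_normal((60, 40))

Q, B = adaptive_rand_qb(A, rel_tol=0.05)

# Principal components follow from a small SVD of B.
U_b, s, Vt = np.linalg.svd(B, full_matrices=False)
U = Q @ U_b
print("rank selected:", Q.shape[1])
print("relative error:", np.linalg.norm(A - Q @ B) / np.linalg.norm(A))
```

The rank is not chosen in advance. The loop stops as soon as the tracked residual falls below the tolerance, which is what distinguishes a fixed-precision scheme from a fixed-rank one.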
