Paper Title
DS-FACTO: Doubly Separable Factorization Machines
Paper Authors
Abstract
Factorization Machines (FM) are a powerful class of models that incorporate higher-order interactions among features to add more expressive power to linear models. They have been used successfully in several real-world tasks such as click prediction, ranking and recommender systems. Despite using a low-rank representation for the pairwise features, the memory overhead of using factorization machines on large-scale real-world datasets can be prohibitively high. For instance, on the Criteo Tera dataset, assuming a modest $128$-dimensional latent representation and $10^{9}$ features, the memory requirement for the model is on the order of $1$ TB. In addition, the data itself occupies $2.1$ TB. Traditional algorithms for FM which run on a single machine are not equipped to handle this scale, and therefore using a distributed algorithm to parallelize the computation across a cluster is inevitable. In this work, we propose a hybrid-parallel stochastic optimization algorithm, DS-FACTO, which partitions both the data and the parameters of the factorization machine simultaneously. Our solution is fully decentralized and does not require the use of any parameter servers. We present empirical results to analyze the convergence behavior, predictive power and scalability of DS-FACTO.
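To make the two quantitative claims in the abstract concrete, the sketch below shows (a) the standard second-order FM prediction, computed with the well-known $O(nk)$ reformulation of the pairwise term, and (b) the back-of-the-envelope memory estimate for the latent factor matrix ($10^{9}$ features, $k = 128$, 8-byte doubles). This is a minimal illustration, not the DS-FACTO implementation; the function name `fm_predict` and the use of dense NumPy arrays are assumptions for exposition.

```python
import numpy as np

def fm_predict(x, w0, w, V):
    """Second-order factorization machine score for one dense example.

    x  : (n,) feature vector
    w0 : scalar bias
    w  : (n,) linear weights
    V  : (n, k) latent factors; the pairwise term
         sum_{i<j} <v_i, v_j> x_i x_j
         is computed via the O(n*k) identity
         0.5 * sum_f [ (sum_i V[i,f] x_i)^2 - sum_i V[i,f]^2 x_i^2 ].
    """
    s = V.T @ x                                    # (k,) per-factor sums
    pairwise = 0.5 * np.sum(s ** 2 - (V ** 2).T @ (x ** 2))
    return w0 + w @ x + pairwise

# Memory estimate quoted in the abstract: with 1e9 features, k = 128
# latent dimensions and 8-byte doubles, V alone needs ~1 TB.
bytes_for_V = 1e9 * 128 * 8
print(bytes_for_V / 1e12)  # ~1.02 TB
```

The $O(nk)$ identity is what makes single-machine FM training tractable per example; the point of the abstract is that even so, the model matrix `V` itself no longer fits on one machine at this scale.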