Paper Title
StackRec: Efficient Training of Very Deep Sequential Recommender Models by Iterative Stacking
Paper Authors
Paper Abstract
Deep learning has brought great progress to sequential recommendation (SR) tasks. With advanced network architectures, sequential recommender models can be stacked with many hidden layers, e.g., up to 100 layers on real-world recommendation datasets. Training such a deep network is difficult because it can be computationally very expensive and takes much more time, especially in situations where there are tens of billions of user-item interactions. To deal with this challenge, we present StackRec, a simple yet very effective and efficient training framework for deep SR models based on iterative layer stacking. Specifically, we first offer an important insight: hidden layers/blocks in a well-trained deep SR model have very similar distributions. Enlightened by this, we propose a stacking operation on the pre-trained layers/blocks to transfer knowledge from a shallower model to a deeper model, and then perform iterative stacking so as to yield a much deeper but easier-to-train SR model. We validate the performance of StackRec by instantiating it with four state-of-the-art SR models in three practical scenarios on real-world datasets. Extensive experiments show that StackRec achieves not only comparable performance but also a substantial acceleration in training time, compared to SR models trained from scratch. The code is available at https://github.com/wangjiachun0426/StackRec.
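The stacking operation described in the abstract can be sketched schematically: trained blocks are copied to initialize the new layers of a deeper model, and this doubling is repeated. This is a minimal illustrative sketch, not the authors' implementation; the block representation and the `stack_blocks`/`iterative_stacking` helpers are assumptions, and the fine-tuning step between stacking rounds is elided.

```python
import copy

def stack_blocks(blocks):
    """Double the model depth by duplicating trained blocks.

    Each trained block is followed by a copy of itself, so the deeper
    model starts from layer-wise distributions similar to the shallow
    model's (the insight the abstract describes).
    """
    stacked = []
    for block in blocks:
        stacked.append(block)                  # keep the trained block
        stacked.append(copy.deepcopy(block))   # its copy initializes a new layer
    return stacked

def iterative_stacking(blocks, rounds):
    """Repeat the stacking operation; in StackRec each round would be
    followed by fine-tuning the deeper model (omitted here)."""
    for _ in range(rounds):
        blocks = stack_blocks(blocks)
        # fine_tune(blocks)  # placeholder: training step not shown
    return blocks

# Toy example: a 2-block model grows to 8 blocks after two stacking rounds.
shallow = [{"name": f"block_{i}"} for i in range(2)]
deep = iterative_stacking(shallow, rounds=2)
print(len(deep))  # → 8
```

In a real SR model the blocks would be, e.g., dilated-convolution or self-attention modules, and copying would duplicate their weight tensors rather than plain dicts; the doubling-and-fine-tuning loop is the part this sketch captures.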