Paper Title
M2TRec: Metadata-aware Multi-task Transformer for Large-scale and Cold-start free Session-based Recommendations
Paper Authors
Paper Abstract
Session-based recommender systems (SBRSs) have shown superior performance over conventional methods. However, they show limited scalability on large-scale industrial datasets since most models learn one embedding per item. This leads to a large memory requirement (storing one vector per item) and poor performance on sparse sessions with cold-start or unpopular items. Using one public and one large industrial dataset, we experimentally show that state-of-the-art SBRSs have low performance on sparse sessions with sparse items. We propose M2TRec, a Metadata-aware Multi-task Transformer model for session-based recommendations. Our proposed method learns a transformation function from item metadata to embeddings, and is thus item-ID free (i.e., it does not need to learn one embedding per item). It integrates item metadata to learn shared representations of diverse item attributes. During inference, new or unpopular items are assigned the same representations for the attributes they share with items previously observed during training, and thus have similar representations to those items, enabling recommendations of even cold-start and sparse items. Additionally, M2TRec is trained in a multi-task setting to predict the next item in the session along with its primary category and subcategories. Our multi-task strategy makes the model converge faster and significantly improves the overall performance. Experimental results show significant performance gains with our proposed approach on sparse items on both datasets.
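The core idea of the item-ID-free encoder can be sketched with a toy example: items are represented purely through shared attribute embeddings, so a cold-start item that shares metadata with a training-time item lands near it in embedding space. The attribute table and mean-pooling below are illustrative stand-ins for M2TRec's learned transformation function, not the paper's actual implementation:

```python
import math

# Hypothetical shared attribute embeddings (in M2TRec these are learned;
# here they are fixed toy vectors for illustration only).
ATTR_EMB = {
    "category:shoes":   [1.0, 0.0, 0.0, 0.0],
    "category:laptops": [0.0, 1.0, 0.0, 0.0],
    "brand:acme":       [0.0, 0.0, 1.0, 0.0],
    "brand:zeta":       [0.0, 0.0, 0.0, 1.0],
    "color:red":        [0.5, 0.5, 0.0, 0.0],
    "color:blue":       [0.5, 0.0, 0.5, 0.0],
    "color:gray":       [0.0, 0.5, 0.0, 0.5],
}

def item_embedding(metadata):
    # Item-ID-free representation: pool the shared attribute embeddings.
    # No per-item vector is ever stored, so memory does not grow with
    # catalog size and unseen items can still be encoded.
    vecs = [ATTR_EMB[a] for a in metadata]
    return [sum(c) / len(vecs) for c in zip(*vecs)]

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) *
                  math.sqrt(sum(b * b for b in v)))

# An item seen during training, a cold-start item sharing two of its three
# attributes, and an unrelated item.
seen  = item_embedding(["category:shoes", "brand:acme", "color:red"])
cold  = item_embedding(["category:shoes", "brand:acme", "color:blue"])
other = item_embedding(["category:laptops", "brand:zeta", "color:gray"])

# The cold-start item is far closer to the seen item than the unrelated one.
print(round(cosine(seen, cold), 3), round(cosine(seen, other), 3))
```

Because representations are derived only from metadata, the cold-start item needs no training interactions of its own: overlapping attributes alone place it near related items, which is what enables recommendations of sparse and unseen items.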