Maximizing Cumulative User Engagement in Sequential Recommendation: An Online Optimization Perspective

Paper Title

Maximizing Cumulative User Engagement in Sequential Recommendation: An Online Optimization Perspective

Authors

Yifei Zhao, Yu-Hang Zhou, Mingdong Ou, Huan Xu, Nan Li

Abstract

To maximize cumulative user engagement (e.g., cumulative clicks) in sequential recommendation, one often needs to trade off two potentially conflicting objectives: pursuing higher immediate user engagement (e.g., click-through rate) and encouraging user browsing (i.e., more items exposed). Existing works often study these two tasks separately and thus tend to produce suboptimal results. In this paper, we study the problem from an online optimization perspective and propose a flexible and practical framework that explicitly trades off longer user browsing length against high immediate user engagement. Specifically, by treating items as actions, user requests as states, and the user leaving as an absorbing state, we formulate each user's behavior as a personalized Markov decision process (MDP), and the problem of maximizing cumulative user engagement reduces to a stochastic shortest path (SSP) problem. Given estimates of immediate user engagement and quit probability, we show that the SSP problem can be solved efficiently via dynamic programming. Experiments on real-world datasets demonstrate the effectiveness of the proposed approach. Moreover, the approach has been deployed on a large e-commerce platform, where it achieved an improvement of over 7% in cumulative clicks.
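
The trade-off described in the abstract can be illustrated with a small dynamic-programming sketch. This is not the authors' algorithm: it assumes a simplified setting with a fixed horizon, a fixed candidate set that can be reused at every step, and per-item click and quit probabilities that do not depend on the step. The function name `plan_sequence` and its parameters are hypothetical.

```python
from typing import List, Tuple


def plan_sequence(
    click_prob: List[float],
    quit_prob: List[float],
    horizon: int,
) -> Tuple[List[int], float]:
    """Backward induction over a fixed number of recommendation steps.

    Assumed (not from the paper): showing item i yields an expected
    immediate engagement click_prob[i], after which the user leaves
    (absorbing state) with probability quit_prob[i].
    """
    n_items = len(click_prob)
    value = 0.0  # expected clicks collected after the final step
    plan: List[int] = []
    for _ in range(horizon):
        # Bellman update: immediate reward plus the continuation value,
        # weighted by the probability that the user keeps browsing.
        best_item, best_value = max(
            (
                (i, click_prob[i] + (1.0 - quit_prob[i]) * value)
                for i in range(n_items)
            ),
            key=lambda pair: pair[1],
        )
        plan.append(best_item)
        value = best_value
    plan.reverse()  # the loop ran from the last step back to the first
    return plan, value


if __name__ == "__main__":
    # Item 0: attractive but users often quit; item 1: less attractive but "sticky".
    plan, expected_clicks = plan_sequence([0.30, 0.10], [0.50, 0.05], horizon=3)
    print(plan, round(expected_clicks, 4))
```

With these made-up numbers, the plan opens with the low-click, low-quit item and switches to the high-click item once few steps remain, which is the kind of browsing-versus-immediate-engagement trade-off the abstract refers to.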
