Paper Title
Learning to Infer User Hidden States for Online Sequential Advertising
Authors
Abstract
To drive purchases in online advertising, advertisers have a strong interest in optimizing their sequential advertising strategy, for which both performance and interpretability matter. The lack of interpretability in existing deep reinforcement learning methods makes the strategy hard to understand, diagnose, and further optimize. In this paper, we propose the Deep Intents Sequential Advertising (DISA) method to address these issues. The key to interpretability is understanding a consumer's purchase intent, which is unobservable (a hidden state). We model this intent as a latent variable and formulate the problem as a Partially Observable Markov Decision Process (POMDP), in which the underlying intents are inferred from observable behaviors. Large-scale industrial offline and online experiments demonstrate that our method outperforms several baselines. We also analyze the inferred hidden states, and the results support the rationality of our inference.
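To make the POMDP framing concrete, the following is a minimal sketch of how a belief over hidden intent states could be updated from one observed behavior. It is a textbook discrete Bayes filter, not the DISA model itself. The two-state setup, the function name `update_belief`, and all probabilities are hypothetical values chosen for illustration only.

```python
def update_belief(belief, transition, emission, observation):
    """One step of a discrete Bayes filter over hidden intent states.

    belief[i]        : current probability of hidden state i
    transition[i][j] : P(next state j | current state i) for the ad action taken
    emission[j][o]   : P(observed behavior o | hidden state j)
    """
    n = len(belief)
    # Predict: propagate the belief through the transition model.
    predicted = [
        sum(belief[i] * transition[i][j] for i in range(n)) for j in range(n)
    ]
    # Correct: weight each state by how well it explains the observation.
    unnormalized = [predicted[j] * emission[j][observation] for j in range(n)]
    total = sum(unnormalized)
    if total == 0.0:
        raise ValueError("observation has zero probability under the model")
    return [u / total for u in unnormalized]


# Hypothetical example: state 0 = low purchase intent, state 1 = high intent.
belief = [0.8, 0.2]
transition = [[0.9, 0.1], [0.2, 0.8]]
# Hypothetical observations: 0 = no click, 1 = click.
emission = [[0.95, 0.05], [0.6, 0.4]]
new_belief = update_belief(belief, transition, emission, observation=1)
```

In this toy setting, observing a click shifts probability mass toward the high-intent state, which is the kind of inference from observable behavior that the abstract describes. A learned model would replace the hand-written tables.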