Dynamic Knapsack Optimization Towards Efficient Multi-Channel Sequential Advertising

Paper Title

Dynamic Knapsack Optimization Towards Efficient Multi-Channel Sequential Advertising

Authors

Xiaotian Hao, Zhaoqing Peng, Yi Ma, Guan Wang, Junqi Jin, Jianye Hao, Shan Chen, Rongquan Bai, Mingzhou Xie, Miao Xu, Zhenzhe Zheng, Chuan Yu, Han Li, Jian Xu, Kun Gai

Abstract

In e-commerce, advertising is essential for merchants to reach their target users. The typical objective is to maximize the advertiser's cumulative revenue over a period of time under a budget constraint. In real applications, an advertisement (ad) usually needs to be exposed to the same user multiple times before the user finally contributes revenue (e.g., places an order). However, existing advertising systems mainly focus on the immediate revenue of single ad exposures and ignore the contribution of each exposure to the final conversion, so they usually fall into suboptimal solutions. In this paper, we formulate sequential advertising strategy optimization as a dynamic knapsack problem. We propose a theoretically guaranteed bilevel optimization framework, which significantly reduces the solution space of the original optimization problem while ensuring solution quality. To improve the exploration efficiency of reinforcement learning, we also devise an effective action space reduction approach. Extensive offline and online experiments show that our approaches outperform state-of-the-art baselines in terms of cumulative revenue.
