论文标题
许多世界中最好的:在线分配问题的双镜下降
The Best of Many Worlds: Dual Mirror Descent for Online Allocation Problems
论文作者
论文摘要
资源限制的在线分配问题是收入管理和在线广告中的核心问题。在这些问题中,请求在有限的视野中依次到达,对于每个请求,决策者需要选择一个消耗一定数量资源并产生奖励的操作。目的是最大化累积奖励受到总体消费的限制。在本文中,我们考虑了一个数据驱动的设置,其中使用决策者未知的输入模型生成每个请求的奖励和资源消耗。我们设计了一类算法类别,这些算法在各种输入模型中都能达到良好的性能,而又不知道它们面临哪种类型的输入。特别是,我们的算法在独立且分布的输入以及各种非平稳随机输入模型下是渐近的最佳选择,并且在输入是对抗性时,它们达到了渐近最佳的固定竞争比率。我们的算法在Lagrangian双重空间中运行:它们为每个资源保持双重乘法器,该乘数使用在线镜像下降进行更新。通过相应地选择参考功能,我们恢复了双级梯度下降和双重乘积重量更新算法。与现有的在线分配问题相比,所得的算法很简单,快速,不需要收入功能,消耗功能和动作空间的凸度。我们讨论了网络收入管理的申请,在线竞标,重复拍卖具有预算限制,在线比例匹配与高熵的匹配以及库存有限的个性化分类优化。
Online allocation problems with resource constraints are central problems in revenue management and online advertising. In these problems, requests arrive sequentially during a finite horizon and, for each request, a decision maker needs to choose an action that consumes a certain amount of resources and generates reward. The objective is to maximize cumulative rewards subject to a constraint on the total consumption of resources. In this paper, we consider a data-driven setting in which the reward and resource consumption of each request are generated using an input model that is unknown to the decision maker. We design a general class of algorithms that attain good performance in various input models without knowing which type of input they are facing. In particular, our algorithms are asymptotically optimal under independent and identically distributed inputs as well as various non-stationary stochastic input models, and they attain an asymptotically optimal fixed competitive ratio when the input is adversarial. Our algorithms operate in the Lagrangian dual space: they maintain a dual multiplier for each resource that is updated using online mirror descent. By choosing the reference function accordingly, we recover the dual sub-gradient descent and dual multiplicative weights update algorithm. The resulting algorithms are simple, fast, and do not require convexity in the revenue function, consumption function and action space, in contrast to existing methods for online allocation problems. We discuss applications to network revenue management, online bidding in repeated auctions with budget constraints, online proportional matching with high entropy, and personalized assortment optimization with limited inventory.