论文标题
Horizon不确定性下的在线资源分配
Online Resource Allocation under Horizon Uncertainty
论文作者
论文摘要
我们研究随机的在线资源分配:决策者需要分配有限的资源来为随机生成的顺序分配请求,以最大程度地提高奖励。在每个时间步骤中,请求是独立于决策者所未知的分布而独立提取的。过去已经对在线资源分配及其特殊情况进行了广泛的研究,但是事先的结果至关重要地依赖于强烈的假设,即决策者事先知道了请求总数(地平线)。在许多应用程序(例如收入管理和在线广告)中,由于需求或用户流量强度的波动,请求的数量可能差异很大。在这项工作中,我们开发了在线算法,这些算法对地平线不确定性是可靠的。与已知的摩尼子设置形成鲜明对比的是,甚至没有算法可以达到独立于地平线不确定性的恒定渐近竞争比。我们介绍了双镜下降的新概括,该概括使决策者可以指定时间变化的目标消耗率的时间表,并证明相应的性能保证。我们继续提供一种快速的算法,用于计算目标消耗率的时间表,这在未知的摩尼子设置中导致近乎最佳的性能。特别是,随着地平线不确定性的增长,我们的竞争比率达到了最佳的增长率(到对数因素)。最后,我们还提供了一种将机器学习的预测纳入地平线的方法,该预测在已知和未知的地平线设置之间插值。
We study stochastic online resource allocation: a decision maker needs to allocate limited resources to stochastically-generated sequentially-arriving requests in order to maximize reward. At each time step, requests are drawn independently from a distribution that is unknown to the decision maker. Online resource allocation and its special cases have been studied extensively in the past, but prior results crucially and universally rely on the strong assumption that the total number of requests (the horizon) is known to the decision maker in advance. In many applications, such as revenue management and online advertising, the number of requests can vary widely because of fluctuations in demand or user traffic intensity. In this work, we develop online algorithms that are robust to horizon uncertainty. In sharp contrast to the known-horizon setting, no algorithm can achieve even a constant asymptotic competitive ratio that is independent of the horizon uncertainty. We introduce a novel generalization of dual mirror descent which allows the decision maker to specify a schedule of time-varying target consumption rates, and prove corresponding performance guarantees. We go on to give a fast algorithm for computing a schedule of target consumption rates that leads to near-optimal performance in the unknown-horizon setting. In particular, our competitive ratio attains the optimal rate of growth (up to logarithmic factors) as the horizon uncertainty grows large. Finally, we also provide a way to incorporate machine-learned predictions about the horizon which interpolates between the known and unknown horizon settings.