Title

Gradient-Bounded Dynamic Programming with Submodular and Concave Extensible Value Functions

Authors

Denis Lebedev, Paul Goulart, Kostas Margellos

Abstract

We consider dynamic programming problems with finite, discrete-time horizons and prohibitively high-dimensional, discrete state-spaces for direct computation of the value function from the Bellman equation. For the case that the value function of the dynamic program is concave extensible and submodular in its state-space, we present a new algorithm that computes deterministic upper and stochastic lower bounds of the value function similar to dual dynamic programming. We then show that the proposed algorithm terminates after a finite number of iterations. Finally, we demonstrate the efficacy of our approach on a high-dimensional numerical example from delivery slot pricing in attended home delivery.
