Paper title

Stagewise Newton Method for Dynamic Game Control with Imperfect State Observation

Authors

Armand Jordana, Bilal Hammoud, Justin Carpentier, Ludovic Righetti

Abstract

In this letter, we study dynamic game optimal control with imperfect state observations and introduce an iterative method to find a local Nash equilibrium. The algorithm consists of an iterative procedure combining a backward recursion similar to minimax differential dynamic programming and a forward recursion resembling a risk-sensitive Kalman smoother. A coupling equation renders the resulting control dependent on the estimation. In the end, the algorithm is equivalent to a Newton step but has linear complexity in the time horizon length. Furthermore, a merit function and a line search procedure are introduced to guarantee convergence of the iterative scheme. The resulting controller reasons about uncertainty by planning for the worst-case disturbances. Lastly, the low computational cost of the proposed algorithm makes it a promising method for output-feedback model predictive control on complex systems at high frequency. Numerical simulations on realistic robotic problems illustrate the risk-sensitive behavior of the resulting controller.
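
The idea of pairing a Newton step with a merit function and a line search can be illustrated on a toy problem. The sketch below is not the paper's stagewise algorithm: it applies a dense Newton step to a static two-player zero-sum game, and the cost `f`, the squared-residual merit function, the Armijo constant, and all function names are assumptions chosen for illustration. The merit function used in the paper may differ.

```python
# Toy static game (assumed for illustration):
#   f(u, w) = 0.25*u**4 + 0.5*u**2 + u*w - w**2
# Player u minimizes f, player w maximizes f. The saddle point is (0, 0).

def residual(u, w):
    """Stationarity conditions: (df/du, df/dw)."""
    return (u**3 + u + w, u - 2.0 * w)

def merit(u, w):
    """Half the squared norm of the stationarity residual."""
    r1, r2 = residual(u, w)
    return 0.5 * (r1 * r1 + r2 * r2)

def newton_step(u, w):
    """Solve J * (du, dw) = -residual for the 2x2 Jacobian J."""
    r1, r2 = residual(u, w)
    a, b, c, d = 3.0 * u**2 + 1.0, 1.0, 1.0, -2.0
    det = a * d - b * c  # equals -(6*u**2 + 3), never zero
    du = (-d * r1 + b * r2) / det
    dw = (c * r1 - a * r2) / det
    return du, dw

def solve_saddle_point(u, w, tol=1e-20, max_iter=50, armijo=1e-4):
    """Newton iterations with backtracking line search on the merit function."""
    for _ in range(max_iter):
        m = merit(u, w)
        if m < tol:
            break
        du, dw = newton_step(u, w)
        alpha = 1.0
        # Along the Newton direction the merit slope is -2*m, so the
        # Armijo condition reads merit(new) <= (1 - 2*armijo*alpha) * m.
        while (merit(u + alpha * du, w + alpha * dw)
               > (1.0 - 2.0 * armijo * alpha) * m and alpha > 1e-8):
            alpha *= 0.5
        u, w = u + alpha * du, w + alpha * dw
    return u, w

u_star, w_star = solve_saddle_point(2.0, -1.5)
```

Here the Newton direction is always a descent direction for the squared-residual merit because the Jacobian is nonsingular everywhere, which is what lets the backtracking loop terminate. In the setting of the abstract, the backward and forward recursions would take the place of the dense 2x2 solve so that the cost grows linearly with the horizon length.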
