Linear-Quadratic Zero-Sum Mean-Field Type Games: Optimality Conditions and Policy Optimization

Paper title

Linear-Quadratic Zero-Sum Mean-Field Type Games: Optimality Conditions and Policy Optimization

Authors

René Carmona, Kenza Hamidouche, Mathieu Laurière, Zongjun Tan

Abstract

In this paper, zero-sum mean-field type games (ZSMFTG) with linear dynamics and quadratic cost are studied under an infinite-horizon discounted utility function. ZSMFTG are a class of games in which two decision makers, whose utilities sum to zero, compete to influence a large population of indistinguishable agents. In particular, the case in which the transition and utility functions depend on the state, the actions of the controllers, and the means of the state and the actions is investigated. The optimality conditions of the game are analysed for both open-loop and closed-loop controls, and explicit expressions for the Nash equilibrium strategies are derived. Moreover, two policy optimization methods that rely on policy gradient are proposed, one for the model-based framework and one for the sample-based framework. In the model-based case, the gradients are computed exactly using the model, whereas in the sample-based case they are estimated using Monte-Carlo simulations. Numerical experiments are conducted to show the convergence of the utility function as well as of the two players' controls.
