Asymmetric Information Acquisition Games

Paper Title

Asymmetric Information Acquisition Games

Authors

Vartika Singh; Veeraruna Kavitha

Abstract

We consider a stochastic game with partial, asymmetric and non-classical information, in which the agents try to acquire as many of the available opportunities/locks as possible. Agents have access only to local information, the information updates are asynchronous, and our aim is to obtain the relevant equilibrium policies. Our approach is to consider optimal open-loop control until the information update, which allows the belief updates to be managed in a structured manner. The agents continuously control the rates of their Poisson search clocks to acquire the locks, and they receive a reward at every successful acquisition; an acquisition is successful if all the previous stages were successful and the agent is the first one to complete. However, none of the agents has access to the acquisition status of the others, which leads to an asymmetric information game. Using standard tools of optimal control theory and Markov decision processes (MDPs), we solve a bi-level control problem: every stage of the dynamic programming equation of the MDP is solved using optimal control tools. We finally reduce the game with an infinite number of states and infinite-dimensional actions to a finite-state game with one-dimensional actions. We provide closed-form expressions for the Nash equilibrium in some special cases and derive asymptotic expressions for some others.
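
As a rough illustration of the "first one to complete" condition, the sketch below simulates a heavily simplified case that is an assumption of this example, not the model of the paper: two agents, a single lock, and constant (uncontrolled) Poisson rates. Under those assumptions each agent's acquisition time is exponential, and the probability that agent A is first is the standard ratio rate_a / (rate_a + rate_b). The function name `simulate_first_acquisition` and its parameters are hypothetical.

```python
import random


def simulate_first_acquisition(rate_a, rate_b, trials=100_000, seed=0):
    """Estimate the probability that agent A acquires a single lock before agent B.

    Assumption for illustration only: both rates are constant, so each
    agent's acquisition time is an independent exponential random variable.
    """
    rng = random.Random(seed)
    wins_a = 0
    for _ in range(trials):
        t_a = rng.expovariate(rate_a)  # time at which A's search clock rings
        t_b = rng.expovariate(rate_b)  # time at which B's search clock rings
        if t_a < t_b:
            wins_a += 1
    return wins_a / trials


if __name__ == "__main__":
    rate_a, rate_b = 2.0, 1.0
    estimate = simulate_first_acquisition(rate_a, rate_b)
    closed_form = rate_a / (rate_a + rate_b)
    print(f"estimated {estimate:.3f}, closed form {closed_form:.3f}")
```

In the setting described in the abstract the rates are time-varying controls and there are multiple stages, so this sketch only conveys the race between exponential clocks that underlies a single acquisition.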
