Towards a Systematic Computational Framework for Modeling Multi-Agent Decision-Making at Micro Level for Smart Vehicles in a Smart World

Paper title

Towards a Systematic Computational Framework for Modeling Multi-Agent Decision-Making at Micro Level for Smart Vehicles in a Smart World

Authors

Dai, Qi; Xu, Xunnong; Guo, Wen; Huang, Suzhou; Filev, Dimitar

Abstract

We propose a multi-agent based computational framework for modeling decision-making and strategic interaction at micro level for smart vehicles in a smart world. The concepts of Markov game and best response dynamics are heavily leveraged. Our aim is to make the framework conceptually sound and computationally practical for a range of realistic applications, including micro path planning for autonomous vehicles. To this end, we first convert the would-be stochastic game problem into a closely related deterministic one by introducing risk premium in the utility function for each individual agent. We show how the sub-game perfect Nash equilibrium of the simplified deterministic game can be solved by an algorithm based on best response dynamics. In order to better model human driving behaviors with bounded rationality, we seek to further simplify the solution concept by replacing the Nash equilibrium condition with a heuristic and adaptive optimization with finite look-ahead anticipation. In addition, the algorithm corresponding to the new solution concept drastically improves the computational efficiency. To demonstrate how our approach can be applied to realistic traffic settings, we conduct a simulation experiment: to derive merging and yielding behaviors on a double-lane highway with an unexpected barrier. Despite assumption differences involved in the two solution concepts, the derived numerical solutions show that the endogenized driving behaviors are very similar. We also briefly comment on how the proposed framework can be further extended in a number of directions in our forthcoming work, such as behavioral calibration using real traffic video data, computational mechanism design for traffic policy optimization, and so on.
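As a rough illustration of the best response dynamics idea mentioned in the abstract, the sketch below iterates alternating best replies in a one-shot, two-vehicle merging game until neither driver wants to change action. It is not the paper's algorithm: the paper works with a multi-step Markov game, whereas this is a single-stage matrix game. The function name, the action labels, and all payoff numbers are hypothetical, and the large negative payoff for both vehicles going is only a stand-in for a collision cost with a risk premium folded in.

```python
# Toy sketch only: a one-shot matrix game, NOT the paper's Markov game.
# All names and payoff values below are illustrative assumptions.

YIELD, GO = 0, 1


def best_response_dynamics(payoff_a, payoff_b, start=(YIELD, YIELD), max_iters=100):
    """Alternate best replies in a two-player matrix game.

    payoff_a[i][j] and payoff_b[i][j] are the utilities of vehicles A and B
    when A plays action i and B plays action j.
    Returns (i, j, converged).
    """
    i, j = start
    n_a, n_b = len(payoff_a), len(payoff_a[0])
    for _ in range(max_iters):
        # A's best reply to B's current action.
        new_i = max(range(n_a), key=lambda a: payoff_a[a][j])
        # B's best reply to A's updated action.
        new_j = max(range(n_b), key=lambda b: payoff_b[new_i][b])
        if (new_i, new_j) == (i, j):
            # Fixed point: each action is a best reply to the other,
            # which is the definition of a pure-strategy Nash equilibrium.
            return i, j, True
        i, j = new_i, new_j
    return i, j, False


# Hypothetical payoffs: going alone gains 2, yielding gains 0,
# and both going costs 10 (collision cost plus an assumed risk premium).
PAYOFF_A = [[0, 0], [2, -10]]
PAYOFF_B = [[0, 2], [0, -10]]

if __name__ == "__main__":
    print(best_response_dynamics(PAYOFF_A, PAYOFF_B, start=(YIELD, YIELD)))
    print(best_response_dynamics(PAYOFF_A, PAYOFF_B, start=(GO, GO)))
```

Under these assumed payoffs the game has two pure equilibria (one vehicle goes while the other yields), and which one the iteration reaches depends on the starting point and on who moves first. Best response dynamics are not guaranteed to converge in general, which is why the loop is capped by `max_iters`.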
