Diversity in Action: General-Sum Multi-Agent Continuous Inverse Optimal Control

Paper Title

Diversity in Action: General-Sum Multi-Agent Continuous Inverse Optimal Control

Authors

Muench, Christian; Oliehoek, Frans A.; Gavrila, Dariu M.

Abstract

Traffic scenarios are inherently interactive. Multiple decision-makers predict the actions of others and choose strategies that maximize their rewards. We view these interactions from the perspective of game theory, which introduces various challenges. Humans are not entirely rational, their rewards need to be inferred from real-world data, and any prediction algorithm needs to be real-time capable so that we can use it in an autonomous vehicle (AV). In this work, we present a game-theoretic method that addresses all of the points above. Compared to many existing methods used for AVs, our approach 1) does not require perfect communication, and 2) allows for individual rewards per agent. Our experiments demonstrate that these more realistic assumptions lead to qualitatively and quantitatively different reward inference and prediction of future actions that match better with expected real-world behaviour.
