Paper Title
Deep Interactive Motion Prediction and Planning: Playing Games with Motion Prediction Models
Paper Authors
Abstract
In most classical Autonomous Vehicle (AV) stacks, the prediction and planning layers are separated, limiting the planner to reacting to predictions that are not informed by the AV's planned trajectory. This work presents a module that tightly couples these layers via a game-theoretic Model Predictive Controller (MPC) that uses a novel interactive multi-agent neural network policy as part of its predictive model. In our setting, the MPC planner considers all the surrounding agents by informing the multi-agent policy with the planned state sequence. Fundamental to the success of our method is the design of a novel multi-agent policy network that can steer a vehicle given the state of the surrounding agents and the map information. The policy network is trained implicitly with ground-truth observation data using backpropagation through time and a differentiable dynamics model to roll out the trajectory forward in time. Finally, we show that our multi-agent policy network learns to drive while interacting with the environment, and, when combined with the game-theoretic MPC planner, can successfully generate interactive behaviors.
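The training scheme described in the abstract, rolling a policy forward through a differentiable dynamics model and optimizing it with backpropagation through time (BPTT), can be illustrated with a deliberately minimal sketch. Everything below is a hypothetical stand-in, not the paper's code: a 1-D point mass replaces the vehicle dynamics, and a two-gain linear feedback law (`k1`, `k2`) replaces the neural policy, so the backward-in-time gradient sweep can be written out by hand.

```python
DT = 0.1   # Euler integration step (assumed)
H = 30     # rollout horizon in time steps (assumed)

def rollout(k1, k2):
    """Roll the closed loop forward, caching every state for BPTT."""
    p, v = -1.0, 0.0                     # start left of the goal (p = 0), at rest
    states = [(p, v)]
    for _ in range(H):
        u = k1 * p + k2 * v              # "policy": linear state feedback
        p, v = p + DT * v, v + DT * u    # differentiable dynamics step
        states.append((p, v))
    return states

def loss_and_grads(k1, k2):
    """Tracking loss and its gradient w.r.t. the policy gains, computed by
    a reverse sweep over the cached rollout (backpropagation through time)."""
    states = rollout(k1, k2)
    loss = sum(p * p + 0.1 * v * v for p, v in states[1:])  # drive p -> 0
    gp = gv = gk1 = gk2 = 0.0
    for (p, v), (pn, vn) in zip(reversed(states[:-1]), reversed(states[1:])):
        gp += 2.0 * pn                   # local loss gradient at step t+1
        gv += 0.2 * vn
        # v_{t+1} = v + DT*(k1*p + k2*v): direct dependence on the gains
        gk1 += gv * DT * p
        gk2 += gv * DT * v
        # chain rule back one step through p_{t+1} = p + DT*v
        gp, gv = gp + gv * DT * k1, gp * DT + gv * (1.0 + DT * k2)
    return loss, gk1, gk2

# A few steps of gradient descent on the gains: the "implicit" training loop,
# supervised only through the rolled-out states rather than per-step actions.
k1, k2 = 0.0, 0.0
for _ in range(20):
    loss, g1, g2 = loss_and_grads(k1, k2)
    k1 -= 1e-3 * g1
    k2 -= 1e-3 * g2
```

The key property this toy shares with the paper's setup is that no action labels are used: the loss is defined on the states produced by the rollout, and gradients reach the policy parameters only through the differentiable dynamics.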