Paper title
Perception as prediction using general value functions in autonomous driving applications
Paper authors
Paper abstract
We propose and demonstrate a framework called perception as prediction for autonomous driving that uses general value functions (GVFs) to learn predictions. Perception as prediction learns data-driven predictions relating to the impact of actions on the agent's perception of the world. It also provides a data-driven approach to predict the impact of the anticipated behavior of other agents on the world without explicitly learning their policy or intentions. We demonstrate perception as prediction by learning to predict an agent's front safety and rear safety with GVFs, which encapsulate anticipation of the behavior of the vehicle in front and in the rear, respectively. The safety predictions are learned through random interactions in a simulated environment containing other agents. We show that these predictions can be used to produce control behavior similar to that of an LQR-based controller in an adaptive cruise control problem, as well as to provide advance warning when the vehicle behind is approaching dangerously. The predictions are compact policy-based predictions that support prediction of the long-term impact on safety when following a given policy. We analyze two controllers that use the learned predictions in a racing simulator to understand the value of the predictions and demonstrate their use in the real world on a Clearpath Jackal robot and an autonomous vehicle platform.
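As a rough illustration of the kind of quantity a GVF predicts, the sketch below estimates the expected discounted sum of a signal (the cumulant) along a fixed sequence of transitions, using tabular TD(0). It is not the method from the paper: the three-state chain, the "front safety" cumulant values, the function name `td0_gvf`, and the step-size and discount settings are all hypothetical and chosen only to keep the example small.

```python
def td0_gvf(transitions, num_states, gamma=0.9, alpha=0.1, sweeps=2000):
    """Tabular TD(0) estimate of a general value function.

    transitions: list of (state, cumulant, next_state, done) tuples
                 generated by following some fixed policy.
    Returns a list with one prediction per state.
    """
    v = [0.0] * num_states
    for _ in range(sweeps):
        for s, cumulant, s_next, done in transitions:
            # Bootstrapped target: cumulant now plus discounted next prediction.
            target = cumulant + (0.0 if done else gamma * v[s_next])
            v[s] += alpha * (target - v[s])
    return v


# Hypothetical chain: the follower closes in on the lead vehicle.
# The cumulant is a made-up "front safety" signal in [0, 1].
transitions = [
    (0, 1.0, 1, False),  # far from the lead vehicle: safe
    (1, 0.5, 2, False),  # closing in: less safe
    (2, 0.0, 2, True),   # too close: unsafe, episode ends
]

values = td0_gvf(transitions, num_states=3)
print([round(x, 3) for x in values])
```

In this toy setting the prediction for state 0 summarizes how safe the next few steps are expected to be under the policy that generated the transitions. That is the sense in which a policy-based prediction can anticipate long-term impact without modeling the other vehicle explicitly.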