PlanT: Explainable Planning Transformers via Object-Level Representations

Paper title

PlanT: Explainable Planning Transformers via Object-Level Representations

Authors

Katrin Renz, Kashyap Chitta, Otniel-Bogdan Mercea, A. Sophia Koepke, Zeynep Akata, Andreas Geiger

Abstract

Planning an optimal route in a complex environment requires efficient reasoning about the surrounding scene. While human drivers prioritize important objects and ignore details not relevant to the decision, learning-based planners typically extract features from dense, high-dimensional grid representations containing all vehicle and road context information. In this paper, we propose PlanT, a novel approach for planning in the context of self-driving that uses a standard transformer architecture. PlanT is based on imitation learning with a compact object-level input representation. On the Longest6 benchmark for CARLA, PlanT outperforms all prior methods (matching the driving score of the expert) while being 5.3x faster than equivalent pixel-based planning baselines during inference. Combining PlanT with an off-the-shelf perception module provides a sensor-based driving system that is more than 10 points better in terms of driving score than the existing state of the art. Furthermore, we propose an evaluation protocol to quantify the ability of planners to identify relevant objects, providing insights regarding their decision-making. Our results indicate that PlanT can focus on the most relevant object in the scene, even when this object is geometrically distant.
