Paper Title

Behavior Transformers: Cloning $k$ modes with one stone

Paper Authors

Nur Muhammad Mahi Shafiullah, Zichen Jeff Cui, Ariuntuya Altanzaya, Lerrel Pinto

Paper Abstract

While behavior learning has made impressive progress in recent times, it lags behind computer vision and natural language processing due to its inability to leverage large, human-generated datasets. Human behaviors have wide variance, multiple modes, and human demonstrations typically do not come with reward labels. These properties limit the applicability of current methods in Offline RL and Behavioral Cloning to learn from large, pre-collected datasets. In this work, we present Behavior Transformer (BeT), a new technique to model unlabeled demonstration data with multiple modes. BeT retrofits standard transformer architectures with action discretization coupled with a multi-task action correction inspired by offset prediction in object detection. This allows us to leverage the multi-modal modeling ability of modern transformers to predict multi-modal continuous actions. We experimentally evaluate BeT on a variety of robotic manipulation and self-driving behavior datasets. We show that BeT significantly improves over prior state-of-the-art work on solving demonstrated tasks while capturing the major modes present in the pre-collected datasets. Finally, through an extensive ablation study, we analyze the importance of every crucial component in BeT. Videos of behavior generated by BeT are available at https://notmahi.github.io/bet
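The abstract's core mechanism, discretizing continuous actions into bins and then applying a learned residual offset, can be illustrated with a minimal sketch. This is not the authors' code: the k-means routine, the `encode`/`decode` helpers, and the toy data are all illustrative assumptions; in BeT the bin and offset would be predicted by a transformer rather than read off the data.

```python
import numpy as np

def kmeans(actions, k, iters=50, seed=0):
    """Toy k-means over continuous action vectors -> k bin centers."""
    rng = np.random.default_rng(seed)
    centers = actions[rng.choice(len(actions), size=k, replace=False)]
    for _ in range(iters):
        # Assign each action to its nearest center, then recompute centers.
        dists = np.linalg.norm(actions[:, None] - centers[None], axis=-1)
        labels = dists.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = actions[labels == j].mean(axis=0)
    return centers

def encode(action, centers):
    """Discretize: nearest-bin index plus the continuous residual offset."""
    idx = int(np.linalg.norm(centers - action, axis=-1).argmin())
    return idx, action - centers[idx]

def decode(idx, offset, centers):
    """Reconstruct a continuous action from (bin index, offset)."""
    return centers[idx] + offset

# Toy bimodal action data: two well-separated clusters in a 2-D action space,
# standing in for the "multiple modes" of human demonstrations.
rng = np.random.default_rng(1)
acts = np.concatenate([rng.normal(-1.0, 0.05, (100, 2)),
                       rng.normal(+1.0, 0.05, (100, 2))])
C = kmeans(acts, k=2)
i, off = encode(acts[0], C)
assert np.allclose(decode(i, off, C), acts[0])  # round-trip is lossless
```

The design point this illustrates: classifying over a small set of bins lets a transformer express multi-modal action distributions, while the per-bin offset restores the continuous precision that plain discretization would lose.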
