Paper Title

Exchangeable Input Representations for Reinforcement Learning

Authors

Mern, John, Sadigh, Dorsa, Kochenderfer, Mykel J.

Abstract

Poor sample efficiency is a major limitation of deep reinforcement learning in many domains. This work presents an attention-based method to project neural network inputs into an efficient representation space that is invariant under changes to input ordering. We show that our proposed representation results in an input space that is a factor of $m!$ smaller for inputs of $m$ objects. We also show that our method is able to represent inputs over variable numbers of objects. Our experiments demonstrate improvements in sample efficiency for policy gradient methods on a variety of tasks. We show that our representation allows us to solve problems that are otherwise intractable when using naïve approaches.
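The abstract describes projecting a set of object inputs into a representation that is invariant to input ordering, collapsing the $m!$ orderings of $m$ objects into one point in representation space. As a minimal illustration of that idea (not the paper's actual architecture; all function and variable names here are hypothetical), the sketch below pools a set of object feature vectors with attention weights, so permuting the objects leaves the output unchanged:

```python
import numpy as np

def attention_pool(objects, w_query, w_key, w_value):
    """Pool m object feature vectors (shape (m, d)) into one fixed-size
    vector that does not depend on the order of the objects."""
    keys = objects @ w_key        # (m, d) per-object keys
    values = objects @ w_value    # (m, d) per-object values
    scores = keys @ w_query       # (m,) one attention logit per object
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()      # softmax over the set of objects
    return weights @ values       # weighted sum: order-independent

rng = np.random.default_rng(0)
m, d = 5, 4
objs = rng.normal(size=(m, d))
wq = rng.normal(size=d)
wk = rng.normal(size=(d, d))
wv = rng.normal(size=(d, d))

pooled = attention_pool(objs, wq, wk, wv)
shuffled = attention_pool(objs[rng.permutation(m)], wq, wk, wv)
assert np.allclose(pooled, shuffled)  # same output for any input order
```

Because both the attention weights and the values are computed per object and then summed, any permutation of the input rows reorders the terms of the sum without changing its result; the same mechanism handles a variable number of objects, since the softmax normalizes over however many rows are present.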
