Representation Matters: Improving Perception and Exploration for Robotics

Paper Title

Representation Matters: Improving Perception and Exploration for Robotics

Authors

Markus Wulfmeier, Arunkumar Byravan, Tim Hertweck, Irina Higgins, Ankush Gupta, Tejas Kulkarni, Malcolm Reynolds, Denis Teplyashin, Roland Hafner, Thomas Lampe, Martin Riedmiller

Abstract

Projecting high-dimensional environment observations into lower-dimensional structured representations can considerably improve data-efficiency for reinforcement learning in domains with limited data such as robotics. Can a single generally useful representation be found? In order to answer this question, it is important to understand how the representation will be used by the agent and what properties such a 'good' representation should have. In this paper we systematically evaluate a number of common learnt and hand-engineered representations in the context of three robotics tasks: lifting, stacking and pushing of 3D blocks. The representations are evaluated in two use-cases: as input to the agent, or as a source of auxiliary tasks. Furthermore, the value of each representation is evaluated in terms of three properties: dimensionality, observability and disentanglement. We can significantly improve performance in both use-cases and demonstrate that some representations can perform commensurate to simulator states as agent inputs. Finally, our results challenge common intuitions by demonstrating that: 1) dimensionality strongly matters for task generation, but is negligible for inputs, 2) observability of task-relevant aspects mostly affects the input representation use-case, and 3) disentanglement leads to better auxiliary tasks, but has only limited benefits for input representations. This work serves as a step towards a more systematic understanding of what makes a 'good' representation for control in robotics, enabling practitioners to make more informed choices for developing new learned or hand-engineered representations.
