RMBench: Benchmarking Deep Reinforcement Learning for Robotic Manipulator Control

Paper Title

RMBench: Benchmarking Deep Reinforcement Learning for Robotic Manipulator Control

Authors

Yanfei Xiang, Xin Wang, Shu Hu, Bin Zhu, Xiaomeng Huang, Xi Wu, Siwei Lyu

Abstract

Reinforcement learning is applied to solve complex real-world tasks from high-dimensional sensory inputs. Over the last decade, a long list of reinforcement learning algorithms has been developed. Recent progress benefits from deep learning for representing raw sensory signals. One question naturally arises: how well do these algorithms perform on different robotic manipulation tasks? Benchmarks use objective performance metrics to offer a scientific way to compare algorithms. In this paper, we present RMBench, the first benchmark for robotic manipulation tasks, which have high-dimensional continuous action and state spaces. We implement and evaluate reinforcement learning algorithms that directly use observed pixels as inputs. We report their average performance and learning curves to show their performance and training stability. Our study concludes that none of the studied algorithms can handle all tasks well, that Soft Actor-Critic outperforms most algorithms in average reward and stability, and that combining an algorithm with data augmentation may facilitate policy learning. Our code, including all benchmark tasks and studied algorithms, is publicly available at https://github.com/xiangyanfei212/RMBench-2022.
