Learning Skill-based Industrial Robot Tasks

Paper title

Learning Skill-based Industrial Robot Tasks with User Priors

Authors

Matthias Mayr, Carl Hvarfner, Konstantinos Chatzilygeroudis, Luigi Nardi, Volker Krueger

Abstract

Robot skills systems are meant to reduce robot setup time for new manufacturing tasks. Yet, for dexterous, contact-rich tasks, it is often difficult to find the right skill parameters. One strategy is to learn these parameters by allowing the robot system to learn directly on the task. For a learning problem, a robot operator can typically specify the type and range of values of the parameters. Nevertheless, given their prior experience, robot operators should be able to help the learning process further by providing educated guesses about where in the parameter space potential optimal solutions could be found. Interestingly, such prior knowledge is not exploited in current robot learning frameworks. We introduce an approach that combines user priors and Bayesian optimization to allow fast optimization of robot industrial tasks at robot deployment time. We evaluate our method on three tasks that are learned in simulation as well as on two tasks that are learned directly on a real robot system. Additionally, we transfer knowledge from the corresponding simulation tasks by automatically constructing priors from well-performing configurations for learning on the real system. To handle potentially contradicting task objectives, the tasks are modeled as multi-objective problems. Our results show that operator priors, both user-specified and transferred, vastly accelerate the discovery of rich Pareto fronts, and typically produce final performance far superior to proposed baselines.
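
Two ingredients named in the abstract can be illustrated with a small sketch: extracting a Pareto front from multi-objective results, and building a prior from well-performing configurations. This is not the authors' implementation and it leaves out the Bayesian optimization loop entirely. The function names, the independent per-parameter Gaussian, the scalar cost used for ranking, and the `top_fraction` cutoff are all assumptions made for illustration.

```python
import random
import statistics


def pareto_front(costs):
    """Return indices of non-dominated points; every objective is minimized."""
    keep = []
    for i, c in enumerate(costs):
        # c is dominated if some other point is no worse in every objective
        # and strictly better in at least one.
        dominated = any(
            all(o <= x for o, x in zip(other, c))
            and any(o < x for o, x in zip(other, c))
            for j, other in enumerate(costs)
            if j != i
        )
        if not dominated:
            keep.append(i)
    return keep


def prior_from_configs(configs, costs, top_fraction=0.25):
    """Fit one Gaussian per parameter to the lowest-cost configurations.

    Assumption: a single scalar cost ranks the configurations; this is a
    simplification chosen for the sketch.
    """
    n_top = max(2, round(top_fraction * len(configs)))
    order = sorted(range(len(configs)), key=lambda i: costs[i])
    best = [configs[i] for i in order[:n_top]]
    columns = list(zip(*best))
    means = [statistics.fmean(col) for col in columns]
    # Small offset keeps the standard deviation strictly positive.
    stds = [statistics.pstdev(col) + 1e-6 for col in columns]
    return means, stds


def sample_from_prior(means, stds, bounds, rng):
    """Draw one candidate from the prior, clipped to the allowed ranges."""
    return [
        min(max(rng.gauss(m, s), lo), hi)
        for m, s, (lo, hi) in zip(means, stds, bounds)
    ]
```

In this sketch, `bounds` plays the role of the parameter ranges an operator specifies, and the fitted means and standard deviations stand in for an educated guess about where good solutions lie. A full method would use such a prior to bias the optimizer's search rather than sample from it directly.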
