Learning to Fly via Deep Model-Based Reinforcement Learning

Paper Title

Learning to Fly via Deep Model-Based Reinforcement Learning

Authors

Philip Becker-Ehmck, Maximilian Karl, Jan Peters, Patrick van der Smagt

Abstract

Learning to control robots without requiring engineered models has been a long-term goal, promising diverse and novel applications. Yet, reinforcement learning has only achieved limited impact on real-time robot control due to its high demand of real-world interactions. In this work, by leveraging a learnt probabilistic model of drone dynamics, we learn a thrust-attitude controller for a quadrotor through model-based reinforcement learning. No prior knowledge of the flight dynamics is assumed; instead, a sequential latent variable model, used generatively and as an online filter, is learnt from raw sensory input. The controller and value function are optimised entirely by propagating stochastic analytic gradients through generated latent trajectories. We show that "learning to fly" can be achieved with less than 30 minutes of experience with a single drone, and can be deployed solely using onboard computational resources and sensors, on a self-built drone.
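
The phrase "propagating stochastic analytic gradients through generated latent trajectories" can be illustrated with a deliberately small sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the latent state is a single number, the "learnt" model is a hand-picked linear-Gaussian transition, the policy is one feedback gain `k`, the reward is the negative squared state, and no value function is used. The names `rollout_return_and_grad` and `improve_gain` are hypothetical. The point it shows is that once the noise samples are drawn up front and treated as fixed inputs, the return of an imagined rollout is a differentiable function of the policy parameter, so its gradient can be computed analytically along the trajectory.

```python
import random


def rollout_return_and_grad(k, z0, noises, a=0.9, b=0.5, sigma=0.1):
    """Return (total reward, d(total reward)/dk) for one imagined rollout.

    Toy model (an assumption): z' = a*z + b*u + sigma*eps, with policy
    u = -k*z and reward -z'**2. The noise samples are fixed inputs, so
    the return is a deterministic, differentiable function of k.
    """
    z, dz_dk = z0, 0.0
    total, dtotal_dk = 0.0, 0.0
    for eps in noises:
        u = -k * z
        du_dk = -z - k * dz_dk               # product rule: z depends on k
        z_next = a * z + b * u + sigma * eps
        dz_next_dk = a * dz_dk + b * du_dk   # chain rule through the model
        total += -z_next ** 2
        dtotal_dk += -2.0 * z_next * dz_next_dk
        z, dz_dk = z_next, dz_next_dk
    return total, dtotal_dk


def improve_gain(k=0.0, steps=300, lr=0.05, horizon=10, batch=16, seed=0):
    """Gradient ascent on the average return of imagined rollouts."""
    rng = random.Random(seed)
    for _ in range(steps):
        grad = 0.0
        for _ in range(batch):
            z0 = rng.gauss(0.0, 1.0)
            noises = [rng.gauss(0.0, 1.0) for _ in range(horizon)]
            grad += rollout_return_and_grad(k, z0, noises)[1] / batch
        k += lr * grad
    return k
```

In this toy setting the closed-loop factor is `a - b*k`, so the gain that drives the state to zero fastest is `a / b`, and `improve_gain()` should settle close to that value. The method described in the abstract replaces the hand-written derivative and the linear model with a learnt sequential latent variable model, and optimises a value function alongside the controller.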
