Paper Title

Weighted Maximum Likelihood for Controller Tuning

Paper Authors

Angel Romero, Shreedhar Govil, Gonca Yilmaz, Yunlong Song, Davide Scaramuzza

Abstract

Recently, Model Predictive Contouring Control (MPCC) has emerged as the state-of-the-art approach for model-based agile flight. MPCC benefits from great flexibility in trading off between progress maximization and path following at runtime, without relying on globally optimized trajectories. However, finding the optimal set of tuning parameters for MPCC is challenging because (i) the full quadrotor dynamics are non-linear, (ii) the cost function is highly non-convex, and (iii) the hyperparameter space is high-dimensional. This paper leverages a probabilistic policy search method, Weighted Maximum Likelihood (WML), to automatically learn the optimal objective for MPCC. WML is sample-efficient because its update of the learning parameters has a closed-form solution. Additionally, the data efficiency provided by the use of a model-based approach allows us to train directly in a high-fidelity simulator, which in turn enables our approach to transfer zero-shot to the real world. We validate our approach in the real world, where we show that our method outperforms both the previous manually tuned controller and the state-of-the-art auto-tuning baseline, reaching speeds of 75 km/h.
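To make the "closed-form solution for updating the learning parameters" concrete, below is a minimal sketch of a generic reward-weighted maximum likelihood update of a Gaussian search distribution over controller (e.g., MPCC cost) parameters, written in Python with NumPy. The exponential reward transform, the temperature beta, and the evaluate_mpcc rollout function are illustrative assumptions for this sketch and are not taken from the paper.

    import numpy as np

    def wml_update(theta_samples, rewards, beta=10.0):
        """One weighted maximum likelihood update of a Gaussian search
        distribution over controller parameters.

        theta_samples: (N, D) array of sampled parameter vectors
        rewards:       (N,) array of episodic returns from rollouts
        beta:          temperature of the exponential reward transform
                       (illustrative choice, not from the paper)
        """
        # Transform (shifted, normalized) returns into nonnegative weights.
        r = rewards - rewards.max()                      # shift for numerical stability
        w = np.exp(beta * r / (np.abs(r).max() + 1e-8))
        w /= w.sum()

        # Closed-form weighted ML estimates of the new mean and covariance.
        mu = w @ theta_samples
        diff = theta_samples - mu
        sigma = (w[:, None] * diff).T @ diff + 1e-6 * np.eye(theta_samples.shape[1])
        return mu, sigma

    # Illustrative tuning loop; evaluate_mpcc is a hypothetical rollout function
    # returning an episodic reward (e.g., from simulated flight performance).
    def tune(evaluate_mpcc, mu0, sigma0, n_iters=20, n_samples=30):
        mu, sigma = mu0, sigma0
        rng = np.random.default_rng(0)
        for _ in range(n_iters):
            thetas = rng.multivariate_normal(mu, sigma, size=n_samples)
            rewards = np.array([evaluate_mpcc(t) for t in thetas])
            mu, sigma = wml_update(thetas, rewards)
        return mu

Because the mean and covariance updates are simple weighted averages rather than gradient steps, each iteration needs only a modest batch of simulated rollouts, which is consistent with the abstract's claim that the method is sample-efficient enough to be trained entirely in a high-fidelity simulator.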
