Robust Policy Search for Robot Navigation

Title

Robust Policy Search for Robot Navigation

Authors

Javier Garcia-Barcos, Ruben Martinez-Cantin

Abstract

Complex robot navigation and control problems can be framed as policy search problems. However, interactive learning in uncertain environments can be expensive, requiring the use of data-efficient methods. Bayesian optimization is an efficient nonlinear optimization method where queries are carefully selected to gather information about the optimum location. This is achieved by a surrogate model, which encodes past information, and the acquisition function for query selection. Bayesian optimization can be very sensitive to uncertainty in the input data or prior assumptions. In this work, we incorporate both robust optimization and statistical robustness, showing that both types of robustness are synergistic. For robust optimization we use an improved version of unscented Bayesian optimization which provides safe and repeatable policies in the presence of policy uncertainty. We also provide new theoretical insights. For statistical robustness, we use an adaptive surrogate model and we introduce the Boltzmann selection as a stochastic acquisition method to have convergence guarantees and improved performance even with surrogate modeling errors. We present results in several optimization benchmarks and robot tasks.
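
The abstract names Boltzmann selection as a stochastic acquisition method but gives no formula. As a rough illustration only, the sketch below assumes the common softmax form: given acquisition values already evaluated at a finite set of candidate queries, the next query is sampled with probability proportional to exp(value / temperature) instead of always taking the maximizer. The function names, the `temperature` parameter, and the example values are hypothetical and not taken from the paper, whose exact formulation may differ.

```python
import math
import random


def boltzmann_probabilities(acq_values, temperature=1.0):
    """Softmax over acquisition values (assumed form, not the paper's definition)."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(acq_values)
    weights = [math.exp((a - peak) / temperature) for a in acq_values]
    total = sum(weights)
    return [w / total for w in weights]


def boltzmann_select(acq_values, temperature=1.0, rng=random):
    """Sample the index of the next query according to the softmax probabilities."""
    probs = boltzmann_probabilities(acq_values, temperature)
    return rng.choices(range(len(acq_values)), weights=probs, k=1)[0]


if __name__ == "__main__":
    # Hypothetical acquisition values at three candidate policies.
    values = [0.1, 0.5, 0.2]
    print(boltzmann_probabilities(values, temperature=0.1))
    print(boltzmann_select(values, temperature=0.1, rng=random.Random(0)))
```

In this sketch a low temperature concentrates the probability on the candidate with the highest acquisition value, approaching a greedy choice, while a high temperature spreads it more evenly across candidates.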
