Real-time Model Predictive Control and System Identification Using Differentiable Physics Simulation

Paper Title

Real-time Model Predictive Control and System Identification Using Differentiable Physics Simulation

Authors

Sirui Chen, Keenon Werling, Albert Wu, C. Karen Liu

Abstract

Developing robot controllers in a simulated environment is advantageous but transferring the controllers to the target environment presents challenges, often referred to as the "sim-to-real gap". We present a method for continuous improvement of modeling and control after deploying the robot to a dynamically-changing target environment. We develop a differentiable physics simulation framework that performs online system identification and optimal control simultaneously, using the incoming observations from the target environment in real time. To ensure robust system identification against noisy observations, we devise an algorithm to assess the confidence of our estimated parameters, using numerical analysis of the dynamic equations. To ensure real-time optimal control, we adaptively schedule the optimization window in the future so that the optimized actions can be replenished faster than they are consumed, while staying as up-to-date with new sensor information as possible. The constant re-planning based on a constantly improved model allows the robot to swiftly adapt to the changing environment and utilize real-world data in the most sample-efficient way. Thanks to a fast differentiable physics simulator, the optimization for both system identification and control can be solved efficiently for robots operating in real time. We demonstrate our method on a set of examples in simulation and show that our results are favorable compared to baseline methods.
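
The scheduling idea in the abstract (place the optimization window far enough in the future that new actions arrive before the old ones run out) can be illustrated with a small sketch. This is not the authors' algorithm. The function names, the `safety_margin` parameter, and the fixed solve time are assumptions made only for illustration.

```python
import math


def next_window_start(current_step, solve_time_s, control_dt_s, safety_margin=1.2):
    """Index of the first control step the next optimization should plan for.

    Hypothetical helper: the robot keeps consuming already-planned actions
    while the optimizer runs, so the new window must begin after the solve
    is expected to finish.
    """
    # Control steps consumed while the optimizer is still running.
    steps_during_solve = math.ceil(safety_margin * solve_time_s / control_dt_s)
    return current_step + steps_during_solve


def buffer_never_runs_dry(solve_time_s, control_dt_s, horizon, total_steps):
    """Check that back-to-back re-planning always has actions available.

    Assumes (for illustration) a constant solve time and that every solve
    returns `horizon` actions starting at its scheduled window start.
    """
    planned_until = horizon  # the initial plan covers steps [0, horizon)
    step = 0
    while step < total_steps:
        start = next_window_start(step, solve_time_s, control_dt_s)
        if start > planned_until:
            # Steps in [planned_until, start) would have no planned action.
            return False
        # The solve finishes by `start`; the new plan covers [start, start + horizon).
        step = start
        planned_until = start + horizon
    return True


if __name__ == "__main__":
    # 20 ms solves at a 10 ms control period: plan three steps ahead.
    print(next_window_start(10, 0.02, 0.01))
    print(buffer_never_runs_dry(0.02, 0.01, horizon=10, total_steps=100))
```

A smaller offset keeps the plan closer to the latest sensor data, while a larger one makes it less likely that the action buffer empties. The abstract describes adapting this trade-off online, whereas the sketch fixes it with a constant margin.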
