Paper title
Learning Constrained Adaptive Differentiable Predictive Control Policies With Guarantees
Paper authors
Paper abstract
We present differentiable predictive control (DPC), a method for learning constrained neural control policies for linear systems with probabilistic performance guarantees. We employ automatic differentiation to obtain direct policy gradients by backpropagating the model predictive control (MPC) loss function and constraint penalties through a differentiable closed-loop system dynamics model. We demonstrate that the proposed method can learn parametric constrained control policies to stabilize systems with unstable dynamics, track time-varying references, and satisfy nonlinear state and input constraints. In contrast with imitation learning-based approaches, our method does not depend on a supervisory controller. Most importantly, we demonstrate that, without losing performance, our method is scalable and computationally more efficient than implicit, explicit, and approximate MPC. Under review at IEEE Transactions on Automatic Control.
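The core idea in the abstract, backpropagating an MPC-style loss with constraint penalties through a differentiable closed-loop rollout, can be illustrated with a minimal JAX sketch. This is not the authors' implementation. The system matrices `A` and `B`, the horizon, the weights, the simple box bounds `U_MAX` and `X_MAX`, and the small `tanh` network are all hypothetical choices made for illustration, and the policy here is plain state feedback rather than the parametric policy described in the paper.

```python
import jax
import jax.numpy as jnp

# Hypothetical unstable linear system x_{k+1} = A x_k + B u_k (assumed values).
A = jnp.array([[1.1, 0.1], [0.0, 1.0]])
B = jnp.array([[0.0], [0.1]])
HORIZON = 10                               # prediction horizon (assumed)
U_MAX, X_MAX = 1.0, 2.0                    # box bounds (assumed)
Q_STATE, Q_INPUT, Q_PEN = 1.0, 0.1, 10.0   # loss weights (assumed)


def init_policy(key, n_x=2, n_hidden=16, n_u=1):
    """Initialize a small two-layer neural policy u = pi(x)."""
    k1, k2 = jax.random.split(key)
    return {
        "W1": 0.1 * jax.random.normal(k1, (n_hidden, n_x)),
        "b1": jnp.zeros(n_hidden),
        "W2": 0.1 * jax.random.normal(k2, (n_u, n_hidden)),
        "b2": jnp.zeros(n_u),
    }


def policy(params, x):
    h = jnp.tanh(params["W1"] @ x + params["b1"])
    return params["W2"] @ h + params["b2"]


def rollout_loss(params, x0):
    """MPC-style loss accumulated over a closed-loop rollout from x0."""

    def step(x, _):
        u = policy(params, x)
        x_next = A @ x + B @ u
        # Quadratic regulation cost on the next state and the input.
        cost = Q_STATE * jnp.sum(x_next**2) + Q_INPUT * jnp.sum(u**2)
        # Soft penalties: squared violation of the input and state bounds.
        pen = jnp.sum(jax.nn.relu(jnp.abs(u) - U_MAX) ** 2)
        pen = pen + jnp.sum(jax.nn.relu(jnp.abs(x_next) - X_MAX) ** 2)
        return x_next, cost + Q_PEN * pen

    _, stage_losses = jax.lax.scan(step, x0, None, length=HORIZON)
    return jnp.sum(stage_losses)


def batch_loss(params, x0_batch):
    """Average rollout loss over a batch of sampled initial states."""
    return jnp.mean(jax.vmap(rollout_loss, in_axes=(None, 0))(params, x0_batch))


@jax.jit
def train_step(params, x0_batch, lr=1e-3):
    """One gradient-descent step using gradients through the whole rollout."""
    loss, grads = jax.value_and_grad(batch_loss)(params, x0_batch)
    params = jax.tree_util.tree_map(lambda p, g: p - lr * g, params, grads)
    return params, loss
```

In this sketch no supervisory controller or precomputed MPC solution is needed: training would repeatedly sample initial states, for example with `jax.random.uniform`, and call `train_step`. Plain gradient descent is used only to keep the example short; the learning rate and network size would need tuning for any real system.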