Title

Lyapunov-Based Reinforcement Learning State Estimator

Authors

Liang Hu, Chengwei Wu, Wei Pan

Abstract

In this paper, we consider the state estimation problem for nonlinear stochastic discrete-time systems. We combine Lyapunov's method in control theory and deep reinforcement learning to design the state estimator. We theoretically prove the convergence of the bounded estimate error solely using the data simulated from the model. An actor-critic reinforcement learning algorithm is proposed to learn the state estimator approximated by a deep neural network. The convergence of the algorithm is analysed. The proposed Lyapunov-based reinforcement learning state estimator is compared with a number of existing nonlinear filtering methods through Monte Carlo simulations, showing its advantage in terms of estimate convergence even under some system uncertainties such as covariance shift in system noise and randomly missing measurements. To the best of our knowledge, this is the first reinforcement learning based nonlinear state estimator with bounded estimate error performance guarantee.
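
The abstract describes the approach only at a high level. For a concrete picture, below is a minimal, illustrative sketch of the general idea, assuming the standard discrete-time stochastic model x_{k+1} = f(x_k) + w_k, y_k = h(x_k) + v_k and a predict-then-correct estimator whose correction term is learned. Everything in the sketch (the toy models `f` and `h`, the network sizes, the loss terms, and the constant `alpha`) is a hypothetical stand-in, not the paper's actual algorithm or its theoretical conditions.

```python
# Illustrative sketch only -- NOT the paper's implementation.
# An "actor" network proposes a measurement-update correction for the state
# estimate; a "critic" network plays the role of a candidate Lyapunov function
# of the estimation error. Training data are simulated from the model, so the
# true state (and hence the estimation error) is available during training.
import torch
import torch.nn as nn

def f(x):   # assumed process model: x_{k+1} = f(x_k) + w_k
    return x + 0.1 * torch.sin(x)

def h(x):   # assumed measurement model: y_k = h(x_k) + v_k
    return x

actor = nn.Sequential(nn.Linear(2, 32), nn.Tanh(), nn.Linear(32, 1))   # learned correction
critic = nn.Sequential(nn.Linear(1, 32), nn.Tanh(), nn.Linear(32, 1))  # Lyapunov candidate
opt = torch.optim.Adam(list(actor.parameters()) + list(critic.parameters()), lr=1e-3)

alpha = 0.1  # hypothetical decrease-rate constant
for step in range(2000):
    x = torch.randn(256, 1)                    # sampled true states
    xhat = x + 0.5 * torch.randn(256, 1)       # sampled initial estimates
    # Simulate one step of the model (training uses simulated data only).
    x_next = f(x) + 0.05 * torch.randn(256, 1)
    y_next = h(x_next) + 0.05 * torch.randn(256, 1)
    # Predict-then-correct: the actor maps (prediction, innovation) to an
    # additive correction, a learned analogue of a filter gain.
    xhat_pred = f(xhat)
    innovation = y_next - h(xhat_pred)
    xhat_next = xhat_pred + actor(torch.cat([xhat_pred, innovation], dim=1))
    e, e_next = x - xhat, x_next - xhat_next
    L, L_next = critic(e.abs()), critic(e_next.abs())
    # Hinge penalty on violations of the Lyapunov decrease condition
    # L(e_{k+1}) - L(e_k) <= -alpha * ||e_k||^2.
    decrease = torch.relu(L_next - L + alpha * e.pow(2)).mean()
    positivity = torch.relu(0.01 * e.abs() - L).mean()   # keep L positive on nonzero error
    loss = decrease + positivity + e_next.pow(2).mean()  # plus plain squared estimation error
    opt.zero_grad(); loss.backward(); opt.step()
```

The one design point taken directly from the abstract is that training relies solely on data simulated from the model, so the true state, and hence the estimation error e_k = x_k - x̂_k, is available to the critic during training; encoding the decrease condition as a hinge loss is just one common way to make such a constraint differentiable.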
