Paper title
Non-asymptotic and Accurate Learning of Nonlinear Dynamical Systems
Authors
Abstract
We consider the problem of learning stabilizable systems governed by a nonlinear state equation $h_{t+1}=ϕ(h_t,u_t;θ)+w_t$. Here $θ$ is the unknown system dynamics, $h_t$ is the state, $u_t$ is the input, and $w_t$ is the additive noise vector. We study gradient-based algorithms that learn the system dynamics $θ$ from samples obtained from a single finite trajectory. If the system is run by a stabilizing input policy, we show that temporally dependent samples can be approximated by i.i.d. samples via a truncation argument based on mixing times. We then develop new guarantees for the uniform convergence of the gradients of the empirical loss. Unlike existing work, our bounds are noise sensitive, which allows for learning the ground-truth dynamics with high accuracy and small sample complexity. Together, our results facilitate efficient learning of general nonlinear systems under a stabilizing policy. We specialize our guarantees to entry-wise nonlinear activations and verify our theory in various numerical experiments.
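To make the setting concrete, the following is a minimal sketch of learning from a single trajectory with gradient descent, not the authors' implementation. It assumes one particular entry-wise instance of the state equation, $h_{t+1}=\tanh(A h_t + B u_t)+w_t$ with $θ=[A, B]$, Gaussian inputs and noise, and a system that is already stable in open loop ($\|A\|<1$), so no separate stabilizing policy is needed. The matrices, step size, and the helper names `simulate` and `fit` are hypothetical choices made for illustration only; the truncation and mixing-time analysis from the abstract is not reproduced here.

```python
import numpy as np


def simulate(A, B, T, noise_std, rng):
    """Roll out h_{t+1} = tanh(A h_t + B u_t) + w_t from h_0 = 0 (assumed model)."""
    n, p = B.shape
    H = np.zeros((T + 1, n))
    U = rng.standard_normal((T, p))              # i.i.d. Gaussian excitation inputs
    W = noise_std * rng.standard_normal((T, n))  # additive process noise w_t
    for t in range(T):
        H[t + 1] = np.tanh(A @ H[t] + B @ U[t]) + W[t]
    return H, U


def fit(H, U, lr=0.3, steps=3000):
    """Gradient descent on the empirical loss (1/2T) * sum_t ||tanh(Theta x_t) - h_{t+1}||^2."""
    T = U.shape[0]
    X = np.hstack([H[:-1], U])                   # regressors x_t = (h_t, u_t)
    Y = H[1:]                                    # targets h_{t+1}
    Theta = np.zeros((H.shape[1], X.shape[1]))   # stacked estimate [A_hat, B_hat]
    for _ in range(steps):
        Z = np.tanh(X @ Theta.T)                 # one-step predictions
        R = Z - Y                                # residuals
        # Chain rule through tanh: d/dz tanh(z) = 1 - tanh(z)^2.
        grad = ((R * (1.0 - Z**2)).T @ X) / T
        Theta -= lr * grad
    return Theta


rng = np.random.default_rng(0)
A = np.array([[0.5, 0.2], [-0.2, 0.5]])          # spectral norm < 1, so the rollout is stable
B = np.eye(2)
Theta_true = np.hstack([A, B])

H, U = simulate(A, B, T=2000, noise_std=0.1, rng=rng)
Theta_hat = fit(H, U)
print("Frobenius estimation error:", np.linalg.norm(Theta_hat - Theta_true))
```

All samples come from one trajectory, so consecutive regressors are temporally dependent; the sketch simply treats them as a batch. Rerunning it with a smaller `noise_std` or a longer `T` gives a rough feel for how the estimation error responds to noise level and trajectory length.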