Title

Forward variable selection enables fast and accurate dynamic system identification with Karhunen-Loève decomposed Gaussian processes

Authors

Kyle Hayes, Michael W. Fouts, Ali Baheri, David S. Mebane

Abstract

A promising approach for scalable Gaussian processes (GPs) is the Karhunen-Loève (KL) decomposition, in which the GP kernel is represented by a set of basis functions that are the eigenfunctions of the kernel operator. Such decomposed kernels have the potential to be very fast and do not depend on the selection of a reduced set of inducing points. However, KL decompositions lead to high dimensionality, and variable selection becomes paramount. This paper reports a new method of forward variable selection, enabled by the ordered nature of the basis functions in the KL expansion of the Bayesian Smoothing Spline ANOVA (BSS-ANOVA) kernel, coupled with fast Gibbs sampling in a fully Bayesian approach. It quickly and effectively limits the number of terms, yielding a method with competitive accuracy, training times, and inference times for tabular datasets of low feature-set dimensionality. The inference speed and accuracy make the method especially useful for dynamic system identification: the dynamics are modeled in the tangent space as a static problem, and the learned dynamics are then integrated using a high-order scheme. The methods are demonstrated on two dynamic datasets: a `Susceptible, Infected, Recovered' (SIR) toy problem, with the transmissibility used as a forcing function, and the experimental `Cascaded Tanks' benchmark dataset. Comparisons on the static prediction of time derivatives are made with a random forest (RF), a residual neural network (ResNet), and the Orthogonal Additive Kernel (OAK) inducing-points scalable GP, while for the time-series prediction, comparisons are made with LSTM and GRU recurrent neural networks (RNNs) and with the SINDy package.
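The tangent-space workflow the abstract describes can be sketched generically: estimate time derivatives from a trajectory, fit a static regression model to them, then integrate the learned dynamics with a high-order scheme. The sketch below is a minimal toy illustration of that pipeline only; it uses a simple least-squares fit and a classical RK4 integrator, not the paper's BSS-ANOVA GP or its forward variable selection, and the decaying test system is an assumed stand-in for the SIR and Cascaded Tanks data.

```python
import numpy as np

# 1) Simulate a trajectory of a known toy system, dx/dt = -0.5 * x
#    (a stand-in for the paper's dynamic datasets).
t = np.linspace(0.0, 5.0, 201)
x = np.exp(-0.5 * t)

# 2) "Tangent space" step: estimate dx/dt with central differences,
#    turning system identification into a static regression problem.
dxdt = np.gradient(x, t)

# 3) Fit a static model dx/dt ~ f(x); a linear fit suffices for this toy
#    (the paper uses a BSS-ANOVA GP with forward variable selection here).
A = np.column_stack([x, np.ones_like(x)])
coef, *_ = np.linalg.lstsq(A, dxdt, rcond=None)
f = lambda xv: coef[0] * xv + coef[1]

# 4) Integrate the learned dynamics with a high-order scheme (classical RK4).
def rk4(f, x0, t):
    out = np.empty_like(t)
    out[0] = x0
    for i in range(len(t) - 1):
        h = t[i + 1] - t[i]
        k1 = f(out[i])
        k2 = f(out[i] + 0.5 * h * k1)
        k3 = f(out[i] + 0.5 * h * k2)
        k4 = f(out[i] + h * k3)
        out[i + 1] = out[i] + (h / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4)
    return out

x_pred = rk4(f, x[0], t)
err = np.max(np.abs(x_pred - x))
print(err)  # reconstruction error of the learned trajectory
```

Because the derivative estimation, the static fit, and the integration are decoupled, the regression model in step 3 can be swapped for any fast static learner; the paper's point is that a KL-decomposed GP makes this step both fast and accurate.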
