Paper Title
Connecting Weighted Automata, Tensor Networks and Recurrent Neural Networks through Spectral Learning
Paper Authors
Paper Abstract
In this paper, we present connections between three models used in different research fields: weighted finite automata~(WFA) from formal languages and linguistics, recurrent neural networks used in machine learning, and tensor networks, which encompass a set of optimization techniques for high-order tensors used in quantum physics and numerical analysis. We first present an intrinsic relation between WFA and the tensor train decomposition, a particular form of tensor network. This relation allows us to exhibit a novel low-rank structure of the Hankel matrix of a function computed by a WFA and to design an efficient spectral learning algorithm leveraging this structure to scale the algorithm up to very large Hankel matrices. We then unravel a fundamental connection between WFA and second-order recurrent neural networks~(2-RNN): in the case of sequences of discrete symbols, WFA and 2-RNN with linear activation functions are expressively equivalent. Leveraging this equivalence result combined with the classical spectral learning algorithm for weighted automata, we introduce the first provable learning algorithm for linear 2-RNN defined over sequences of continuous input vectors. This algorithm relies on estimating low-rank sub-blocks of the Hankel tensor, from which the parameters of a linear 2-RNN can be provably recovered. The performance of the proposed learning algorithm is assessed in a simulation study on both synthetic and real-world data.
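To make the stated WFA/2-RNN equivalence concrete, below is a minimal sketch (Python with NumPy; the names alpha, A, omega, the shapes, and the random parameters are illustrative assumptions, not the authors' code). A linear 2-RNN updates its hidden state bilinearly in the previous state and the current input vector; when the inputs are one-hot encodings of discrete symbols, this computation reduces exactly to a WFA evaluating alpha^T A_{sigma_1} ... A_{sigma_k} omega.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 3, 4                          # hidden dimension, input dimension
alpha = rng.standard_normal(n)       # initial hidden state / initial weights
A = rng.standard_normal((n, d, n))   # third-order bilinear transition tensor
omega = rng.standard_normal(n)       # termination vector

def linear_2rnn(xs):
    """f(x_1, ..., x_k) for a sequence of input vectors, with linear activations."""
    h = alpha
    for x in xs:
        # bilinear update: h_new[j] = sum_{i,s} h[i] * A[i, s, j] * x[s]
        h = np.einsum('i,isj,s->j', h, A, x)
    return h @ omega

# On one-hot inputs e_sigma, each update multiplies by the matrix slice
# A[:, sigma, :], i.e. the 2-RNN computes the WFA value
# alpha^T A_{sigma_1} ... A_{sigma_k} omega.
seq = [1, 0, 2]
one_hot = [np.eye(d)[s] for s in seq]
h = alpha
for s in seq:
    h = h @ A[:, s, :]
assert np.isclose(linear_2rnn(one_hot), h @ omega)
```

In the paper's setting, (alpha, A, omega) are the parameters that the proposed spectral learning algorithm provably recovers from estimated low-rank sub-blocks of the Hankel tensor.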