Paper Title

An Improved Time Feedforward Connections Recurrent Neural Networks

Paper Authors

Jin Wang, Yongsong Zou, Se-Jung Lim

Paper Abstract

Recurrent Neural Networks (RNNs) have been widely applied to temporal problems such as flood forecasting and financial data processing. On the one hand, traditional RNN models amplify the gradient problem because of their strict serial time dependency, making it difficult to realize a long-term memory function. On the other hand, RNN cells are highly complex, which significantly increases computational complexity and wastes computational resources during model training. In this paper, an improved Time Feedforward Connections Recurrent Neural Networks (TFC-RNNs) model is first proposed to address the gradient problem. A parallel branch is introduced so that the hidden state at time t-2 can be transferred directly to time t without the nonlinear transformation at time t-1, which effectively improves the long-term dependence of RNNs. Then, a novel cell structure named Single Gate Recurrent Unit (SGRU) is presented; it reduces the number of parameters in the RNN cell and consequently the computational complexity. Next, the SGRU is applied to the TFC-RNNs to form a new TFC-SGRU model that addresses both difficulties. Finally, the performance of the proposed TFC-SGRU is verified through several experiments in terms of long-term memory and anti-interference capability. Experimental results demonstrate that the proposed TFC-SGRU model can capture useful information across 1500 time steps and effectively filter out noise, and its accuracy on language-processing tasks is better than that of the LSTM and GRU models.
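
The abstract does not give the model's update equations, so the PyTorch sketch below only illustrates the two ideas it describes: a single-gate recurrent cell (fewer parameters than GRU or LSTM) and a time-feedforward branch that lets the hidden state from time t-2 reach time t without passing through the nonlinearity at t-1. The class name TFCSGRUSketch, the gate and candidate forms, and the 0.5 weighting on the t-2 branch are illustrative assumptions, not the authors' formulation.

```python
# Minimal sketch, assuming a GRU-like single gate and an additive t-2 branch.
# This is not the paper's reference implementation.
import torch
import torch.nn as nn


class TFCSGRUSketch(nn.Module):
    def __init__(self, input_size: int, hidden_size: int):
        super().__init__()
        # A single gate, hence fewer parameters than GRU (2 gates) or LSTM (3 gates).
        self.gate = nn.Linear(input_size + hidden_size, hidden_size)
        self.candidate = nn.Linear(input_size + hidden_size, hidden_size)

    def forward(self, x_seq: torch.Tensor) -> torch.Tensor:
        # x_seq: (seq_len, batch, input_size)
        seq_len, batch, _ = x_seq.shape
        hidden = self.gate.out_features
        h_prev = x_seq.new_zeros(batch, hidden)   # h_{t-1}
        h_prev2 = x_seq.new_zeros(batch, hidden)  # h_{t-2}, the parallel branch
        for t in range(seq_len):
            xh = torch.cat([x_seq[t], h_prev], dim=-1)
            z = torch.sigmoid(self.gate(xh))          # single gate
            h_tilde = torch.tanh(self.candidate(xh))  # candidate state
            # Time feedforward connection: h_{t-2} is added directly, skipping
            # the nonlinear transformation applied at t-1 (illustrative weighting).
            h_t = z * h_prev + (1 - z) * h_tilde + 0.5 * h_prev2
            h_prev2, h_prev = h_prev, h_t
        return h_prev  # final hidden state


# Usage example: final hidden state for a batch of 4 random sequences of length 10.
model = TFCSGRUSketch(input_size=8, hidden_size=16)
out = model(torch.randn(10, 4, 8))  # shape: (4, 16)
```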
