Paper Title


Training DNNs in O(1) memory with MEM-DFA using Random Matrices

Authors

Tien Chu, Kamil Mykitiuk, Miron Szewczyk, Adam Wiktor, Zbigniew Wojna

Abstract

This work presents a method for reducing memory consumption to a constant complexity when training deep neural networks. The algorithm is based on more biologically plausible alternatives to backpropagation (BP): direct feedback alignment (DFA) and feedback alignment (FA), which use random matrices to propagate the error. The proposed method, memory-efficient direct feedback alignment (MEM-DFA), exploits the higher independence of layers in DFA to avoid storing all activation vectors at once, unlike standard BP, FA, and DFA. As a result, our algorithm's memory usage is constant regardless of the number of layers in the neural network. The method increases the computational cost only by a constant factor of one extra forward pass. MEM-DFA, BP, FA, and DFA were evaluated, along with their memory profiles, on the MNIST and CIFAR-10 datasets with various neural network models. Our experiments agree with our theoretical results and show a significant decrease in the memory cost of MEM-DFA compared to the other algorithms.
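
To make the idea concrete, below is a minimal NumPy sketch of a DFA-style update that keeps only one activation vector at a time: a first forward pass produces the global output error, and a second pass recomputes activations layer by layer, updating each layer immediately through its fixed random feedback matrix. This is a hedged illustration of the mechanism described in the abstract, not the authors' reference implementation; the network shape, tanh activations, squared-error loss, and helper names are illustrative assumptions.

```python
# Minimal sketch of a constant-memory DFA-style update (MEM-DFA idea, assumed details).
import numpy as np

rng = np.random.default_rng(0)

def init_net(sizes):
    """Weights W[i] and fixed random feedback matrices B[i] for each hidden layer."""
    W = [rng.standard_normal((n_out, n_in)) * 0.05
         for n_in, n_out in zip(sizes[:-1], sizes[1:])]
    # B[i] projects the output error back to layer i's dimensionality (DFA feedback).
    B = [rng.standard_normal((n_hid, sizes[-1])) * 0.05 for n_hid in sizes[1:-1]]
    return W, B

def forward(W, x):
    """Plain forward pass; intermediate activations are NOT stored."""
    h = x
    for Wi in W[:-1]:
        h = np.tanh(Wi @ h)
    return W[-1] @ h  # linear output layer

def mem_dfa_step(W, B, x, y, lr=0.01):
    """One update: pass 1 computes the error, pass 2 recomputes and updates per layer."""
    # Pass 1: global output error only (e.g. gradient of a squared-error loss).
    e = forward(W, x) - y
    # Pass 2: recompute activations one layer at a time, updating on the fly.
    h_prev = x
    for i, Wi in enumerate(W[:-1]):
        h = np.tanh(Wi @ h_prev)               # recomputed with the pre-update weights
        delta = (B[i] @ e) * (1.0 - h ** 2)    # DFA error signal via the random matrix
        W[i] -= lr * np.outer(delta, h_prev)   # update, then discard h_prev
        h_prev = h
    W[-1] -= lr * np.outer(e, h_prev)          # output layer uses the true error
    return W
```

Only `e`, `h_prev`, and the current layer's activation are alive at any point, so the memory overhead is independent of depth, at the price of the one extra forward pass noted in the abstract.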
