Paper Title
Memory-Efficient Training of RNN-Transducer with Sampled Softmax
Paper Authors
Paper Abstract
RNN-Transducer has been one of the promising architectures for end-to-end automatic speech recognition. Although RNN-Transducer has many advantages, including strong accuracy and a streaming-friendly property, its high memory consumption during training has been a critical obstacle to development. In this work, we propose applying sampled softmax to RNN-Transducer, which requires only a small subset of the vocabulary during training and thus reduces memory consumption. We further extend sampled softmax to optimize memory consumption for a minibatch, and employ the distributions of auxiliary CTC losses to sample the vocabulary and improve model accuracy. We present experimental results on LibriSpeech, AISHELL-1, and CSJ-APS, where sampled softmax greatly reduces memory consumption while maintaining the accuracy of the baseline model.
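To make the mechanism concrete: the RNN-T joint network produces a logit tensor of shape B×T×U×V, and with a large vocabulary V this tensor dominates training memory. Below is a minimal sketch of the sampled-softmax idea for the joint output, assuming a PyTorch-style model. The function name `sampled_rnnt_logits` and its signature are hypothetical illustrations, not the paper's implementation, and the uniform sampling here stands in for the CTC-guided sampling the abstract mentions.

```python
import torch

def sampled_rnnt_logits(joint_hidden, proj_weight, proj_bias,
                        targets, blank_id, num_samples=1024):
    """Compute RNN-T joint logits over a sampled vocabulary subset.

    joint_hidden: [B, T, U, H] combined encoder/predictor states.
    proj_weight:  [V, H] full output projection; V is the vocabulary size.
    targets:      [B, U] label indices for the minibatch.

    Instead of materializing the full [B, T, U, V] logit tensor, we project
    only onto a kept subset of rows, shrinking the dominant memory term.
    """
    vocab_size = proj_weight.size(0)
    device = proj_weight.device

    # Draw a random vocabulary subset (uniform here; the paper additionally
    # explores sampling guided by an auxiliary CTC output distribution).
    sampled = torch.randperm(vocab_size, device=device)[:num_samples]

    # The blank symbol and every target label in the minibatch must stay in
    # the subset so the transducer loss can still address the correct tokens.
    keep = torch.unique(torch.cat([
        sampled, targets.reshape(-1),
        torch.tensor([blank_id], device=device),
    ]))

    # Project only onto the kept rows: [B, T, U, |keep|] instead of [B, T, U, V].
    logits = joint_hidden @ proj_weight[keep].T + proj_bias[keep]

    # Remap original token ids to their positions inside `keep`, so the
    # RNN-T loss can be evaluated against the reduced logit tensor.
    remap = torch.full((vocab_size,), -1, dtype=torch.long, device=device)
    remap[keep] = torch.arange(keep.numel(), device=device)
    return logits, remap[targets], remap[blank_id].item()
```

Since the joint logit tensor scales linearly in the size of the output vocabulary, replacing V with the much smaller |keep| is where the memory savings in this sketch come from; per-minibatch sharing of the sampled subset, as the abstract describes, amortizes the sampling further.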