Paper Title

Memory-Efficient Training of RNN-Transducer with Sampled Softmax

Paper Authors

Jaesong Lee, Lukas Lee, Shinji Watanabe

Paper Abstract

RNN-Transducer has been one of the most promising architectures for end-to-end automatic speech recognition. Although RNN-Transducer has many advantages, including strong accuracy and a streaming-friendly property, its high memory consumption during training has been a critical obstacle to development. In this work, we propose applying sampled softmax to RNN-Transducer; it requires only a small subset of the vocabulary during training and thus saves memory. We further extend sampled softmax to optimize memory consumption over a minibatch, and employ the distributions of auxiliary CTC losses to sample the vocabulary, improving model accuracy. We present experimental results on LibriSpeech, AISHELL-1, and CSJ-APS, where sampled softmax greatly reduces memory consumption while maintaining the accuracy of the baseline model.
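
To make the memory saving concrete, below is a minimal PyTorch-style sketch of the core idea; this is an illustrative reconstruction under stated assumptions, not the authors' implementation. The joint network's output projection is evaluated only over a sampled vocabulary subset S containing the blank symbol, the minibatch's target labels, and random negatives, so the logit tensor shrinks from (B, T, U, V) to (B, T, U, |S|). All function and argument names are hypothetical, the negatives are drawn uniformly rather than from the paper's auxiliary-CTC distributions, and the sampling-probability correction of full sampled softmax is omitted for brevity.

```python
import torch

def sampled_vocab_logits(joint_hidden, out_proj_weight, out_proj_bias,
                         targets, blank_id, vocab_size, num_negatives=1024):
    """Joint-network logits over a sampled vocabulary subset S (sketch).

    joint_hidden:    (B, T, U, H) combined encoder/predictor features.
    out_proj_weight: (V, H) full output projection; only rows in S are
                     gathered, so logits are (B, T, U, |S|), not (B, T, U, V).
    targets:         (B, U) label ids of the minibatch (padding ignored here).
    """
    device = targets.device
    # S = {blank} ∪ {all target labels in the minibatch} ∪ random negatives.
    positives = torch.cat([torch.tensor([blank_id], device=device),
                           targets.reshape(-1)])
    negatives = torch.randint(0, vocab_size, (num_negatives,), device=device)
    subset = torch.unique(torch.cat([positives, negatives]))  # (|S|,) sorted ids

    # Gather the reduced projection and compute logits over S only.
    w = out_proj_weight[subset]          # (|S|, H)
    b = out_proj_bias[subset]            # (|S|,)
    logits = joint_hidden @ w.T + b      # (B, T, U, |S|)

    # Remap original label ids to their positions inside S so a transducer
    # loss can index the reduced logit tensor.
    remap = torch.full((vocab_size,), -1, dtype=torch.long, device=device)
    remap[subset] = torch.arange(subset.numel(), device=device)
    return logits, remap[targets], int(remap[blank_id])
```

Sharing one subset across the whole minibatch mirrors the paper's minibatch-level extension: the union of all target labels is covered by a single subset, so the reduced projection is gathered once per step instead of per utterance.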
