Paper Title
Memory-Efficient Training of RNN-Transducer with Sampled Softmax
Paper Authors
Paper Abstract
RNN-Transducer has been one of the promising architectures for end-to-end automatic speech recognition. Although RNN-Transducer has many advantages, including strong accuracy and a streaming-friendly property, its high memory consumption during training has been a critical obstacle to development. In this work, we propose applying sampled softmax to RNN-Transducer, which requires only a small subset of the vocabulary during training and thus reduces memory consumption. We further extend sampled softmax to optimize memory consumption for a minibatch, and employ the distributions of auxiliary CTC losses to sample the vocabulary and improve model accuracy. We present experimental results on LibriSpeech, AISHELL-1, and CSJ-APS, where sampled softmax greatly reduces memory consumption while maintaining the accuracy of the baseline model.
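To make the mechanism concrete: the RNN-T joint network produces a logit tensor of shape B×T×U×V, and with a large vocabulary V this tensor dominates training memory. Below is a minimal sketch of the sampled-softmax idea for the joint output, assuming a PyTorch-style model. The function name `sampled_rnnt_logits` and its signature are hypothetical illustrations, not the paper's implementation, and the uniform sampling here stands in for the CTC-guided sampling the abstract mentions.

```python
import torch

def sampled_rnnt_logits(joint_hidden, proj_weight, proj_bias,
                        targets, blank_id, num_samples=1024):
    """Compute RNN-T joint logits over a sampled vocabulary subset.

    joint_hidden: [B, T, U, H] combined encoder/predictor states.
    proj_weight:  [V, H] full output projection; V is the vocabulary size.
    targets:      [B, U] label indices for the minibatch.

    Instead of materializing the full [B, T, U, V] logit tensor, we project
    only onto a kept subset of rows, shrinking the dominant memory term.
    """
    vocab_size = proj_weight.size(0)
    device = proj_weight.device

    # Draw a random vocabulary subset (uniform here; the paper additionally
    # explores sampling guided by an auxiliary CTC output distribution).
    sampled = torch.randperm(vocab_size, device=device)[:num_samples]

    # The blank symbol and every target label in the minibatch must stay in
    # the subset so the transducer loss can still address the correct tokens.
    keep = torch.unique(torch.cat([
        sampled, targets.reshape(-1),
        torch.tensor([blank_id], device=device),
    ]))

    # Project only onto the kept rows: [B, T, U, |keep|] instead of [B, T, U, V].
    logits = joint_hidden @ proj_weight[keep].T + proj_bias[keep]

    # Remap original token ids to their positions inside `keep`, so the
    # RNN-T loss can be evaluated against the reduced logit tensor.
    remap = torch.full((vocab_size,), -1, dtype=torch.long, device=device)
    remap[keep] = torch.arange(keep.numel(), device=device)
    return logits, remap[targets], remap[blank_id].item()
```

Since the joint logit tensor scales linearly in the size of the output vocabulary, replacing V with the much smaller |keep| is where the memory savings in this sketch come from; per-minibatch sharing of the sampled subset, as the abstract describes, amortizes the sampling further.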