Paper Title

Low Latency ASR for Simultaneous Speech Translation

Authors

Nguyen, Thai Son; Niehues, Jan; Cho, Eunah; Ha, Thanh-Le; Kilgour, Kevin; Muller, Markus; Sperber, Matthias; Stueker, Sebastian; Waibel, Alex

Abstract

User studies have shown that reducing the latency of our simultaneous lecture translation system should be the most important goal. We have therefore worked on several techniques for reducing the latency of both components, the automatic speech recognition module and the speech translation module. Since the commonly used commitment latency is not appropriate in our case of continuous stream decoding, we focused on word latency. We used it to analyze the performance of our current system and to identify opportunities for improvement. In order to minimize latency, we combined run-on decoding with a technique for identifying stable partial hypotheses during stream decoding and a protocol for dynamic output update that allows the most recent parts of the transcription to be revised. This combination reduces the word-level latency, measured at the point where words are final and will never be updated again, from 18.1 s to 1.1 s without sacrificing performance in terms of word error rate.
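
The abstract does not spell out how stable partial hypotheses are identified or how word latency is computed, so the following is only a minimal sketch under stated assumptions, not the authors' method. It assumes a simple heuristic: a word is treated as final once it has stayed unchanged across the last `k` partial hypotheses. It also assumes word latency is the time a word becomes final minus the time the word ended in the audio. The names `StablePrefixTracker`, `word_latencies`, and the parameter `k` are hypothetical.

```python
from typing import List, Sequence


def common_prefix_len(a: Sequence[str], b: Sequence[str]) -> int:
    """Number of leading words that two hypotheses share."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


class StablePrefixTracker:
    """Commits words unchanged across the last k partial hypotheses.

    Assumed heuristic for illustration only; the paper's actual
    stability criterion is not described in the abstract.
    """

    def __init__(self, k: int = 2) -> None:
        self.k = k
        self.history: List[List[str]] = []
        self.committed: List[str] = []
        self.commit_times: List[float] = []

    def update(self, hypothesis: List[str], now: float) -> List[str]:
        """Feed a new partial hypothesis; return the newly finalized words."""
        self.history = (self.history + [list(hypothesis)])[-self.k:]
        if len(self.history) < self.k:
            return []
        # Longest prefix shared by all of the last k hypotheses.
        stable = len(self.history[0])
        for other in self.history[1:]:
            stable = min(stable, common_prefix_len(self.history[0], other))
        # Only words beyond the already committed prefix are new;
        # committed words are never revised.
        new_words = self.history[0][len(self.committed):stable]
        self.committed.extend(new_words)
        self.commit_times.extend([now] * len(new_words))
        return new_words


def word_latencies(commit_times: Sequence[float],
                   word_end_times: Sequence[float]) -> List[float]:
    """Per-word latency: time of finalization minus end of the word in the audio."""
    return [c - e for c, e in zip(commit_times, word_end_times)]
```

In this sketch, words beyond the committed prefix remain open to revision, which loosely mirrors the idea of a dynamic output update for the most recent part of the transcription. A larger `k` makes commitments safer but increases word latency.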
