CHiME-6 Challenge: Tackling Multispeaker Speech Recognition for Unsegmented Recordings

Paper Title

CHiME-6 Challenge: Tackling Multispeaker Speech Recognition for Unsegmented Recordings

Authors

Shinji Watanabe, Michael Mandel, Jon Barker, Emmanuel Vincent, Ashish Arora, Xuankai Chang, Sanjeev Khudanpur, Vimal Manohar, Daniel Povey, Desh Raj, David Snyder, Aswin Shanmugam Subramanian, Jan Trmal, Bar Ben Yair, Christoph Boeddeker, Zhaoheng Ni, Yusuke Fujita, Shota Horiguchi, Naoyuki Kanda, Takuya Yoshioka, Neville Ryant

Abstract

Following the success of the 1st, 2nd, 3rd, 4th and 5th CHiME challenges, we organize the 6th CHiME Speech Separation and Recognition Challenge (CHiME-6). The new challenge revisits the previous CHiME-5 challenge and further considers the problem of distant multi-microphone conversational speech diarization and recognition in everyday home environments. Speech material is the same as the previous CHiME-5 recordings except for accurate array synchronization. The material was elicited using a dinner party scenario with efforts taken to capture data that is representative of natural conversational speech. This paper provides a baseline description of the CHiME-6 challenge for both segmented multispeaker speech recognition (Track 1) and unsegmented multispeaker speech recognition (Track 2). Of note, Track 2 is the first challenge activity in the community to tackle an unsegmented multispeaker speech recognition scenario with a complete set of reproducible open source baselines providing speech enhancement, speaker diarization, and speech recognition modules.
