Paper Title

Conversational Speech Separation: an Evaluation Study for Streaming Applications

Authors

Giovanni Morrone, Samuele Cornell, Enrico Zovato, Alessio Brutti, Stefano Squartini

Abstract

Continuous speech separation (CSS) is a recently proposed framework that aims to separate each speaker from an input mixture signal in a streaming fashion. Here we present an evaluation study of practical design considerations for a CSS system, addressing important aspects that have been neglected in recent works. In particular, we focus on the trade-off between separation performance, computational requirements, and output latency, showing how an offline separation algorithm can be used to perform CSS with a desired latency. We carry out an extensive analysis of the choice of CSS processing window size and hop size on sparsely overlapped data. We find that the best trade-off between computational burden and performance is obtained with a window of 5 s.
