Title
Chronological Self-Training for Real-Time Speaker Diarization
Authors
Abstract
Diarization partitions an audio stream into segments based on the voices of the speakers. Real-time diarization systems that include an enrollment step should limit enrollment training samples to reduce user interaction time. Although training on a small number of samples yields poor performance, we show that the accuracy can be improved dramatically using a chronological self-training approach. We studied the tradeoff between training time and classification performance and found that 1 second is sufficient to reach over 95% accuracy. We evaluated on 700 audio conversation files of about 10 minutes each from 6 different languages and demonstrated average diarization error rates as low as 10%.
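The abstract does not specify the features, the classifier, or the rule for accepting self-labeled samples, so the following is only a minimal sketch of one way a chronological self-training loop could look. It assumes each audio segment has already been reduced to a fixed-length feature vector, uses a nearest-centroid classifier with cosine similarity, and accepts a prediction as new training data when its similarity exceeds a threshold. The function names, the `threshold` parameter, and the toy vectors are all hypothetical and are not taken from the paper.

```python
from math import sqrt


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def chronological_self_train(enrollment, stream, threshold=0.8):
    """Label segments in time order, growing each speaker model as it goes.

    enrollment: dict mapping speaker name -> list of feature vectors
                (the short enrollment samples).
    stream:     list of feature vectors, one per segment, in chronological order.
    threshold:  assumed confidence rule; a prediction is added to the
                speaker's model only if its similarity is at least this value.
    """
    # Keep a running sum and count per speaker so centroids update cheaply.
    sums = {spk: [sum(col) for col in zip(*vecs)] for spk, vecs in enrollment.items()}
    counts = {spk: len(vecs) for spk, vecs in enrollment.items()}

    labels = []
    for segment in stream:
        # Score the segment against every speaker's current centroid.
        scores = {
            spk: cosine(segment, [s / counts[spk] for s in sums[spk]])
            for spk in sums
        }
        best = max(scores, key=scores.get)
        labels.append(best)

        # Self-training step: confident predictions become training data
        # for all later segments.
        if scores[best] >= threshold:
            sums[best] = [s + v for s, v in zip(sums[best], segment)]
            counts[best] += 1
    return labels


# Toy usage with made-up 2-D features: one enrollment vector per speaker.
enrollment = {"A": [[1.0, 0.0]], "B": [[0.0, 1.0]]}
stream = [[0.9, 0.1], [0.2, 0.8], [0.6, 0.4]]
labels = chronological_self_train(enrollment, stream)
print(labels)
```

Processing segments strictly in time order means the sketch never looks ahead, which is what makes this kind of loop compatible with real-time use: each segment is labeled using only the enrollment data and the segments that came before it.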