Paper Title

Learning to Detect Keyword Parts and Whole by Smoothed Max Pooling

Authors

Hyun-Jin Park, Patrick Violette, Niranjan Subrahmanya

Abstract

We propose a smoothed max pooling loss and its application to keyword spotting systems. The proposed approach jointly trains an encoder (to detect keyword parts) and a decoder (to detect the whole keyword) in a semi-supervised manner. The new loss function allows a model to be trained to detect both the parts and the whole of a keyword without strictly depending on frame-level labels from LVCSR (large-vocabulary continuous speech recognition), which makes further optimization possible. The proposed system outperforms the baseline keyword spotting model in [1] because it is easier to optimize. Further, it can be adapted more easily to on-device learning applications because of its reduced dependency on LVCSR.
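The abstract does not give the formula for the loss, so the following is only a rough sketch of one plausible reading of its name, not the authors' definition. It assumes the per-frame posteriors for one keyword part are first smoothed over time with a moving-average filter, that the maximum is then taken inside a coarse time window, and that the negative log of that maximum is the loss. The function name, the `window` argument, and the choice of a moving-average kernel are all hypothetical.

```python
import numpy as np


def smoothed_max_pool_loss(posteriors, window, smooth_len=3):
    """Hypothetical sketch: negative log of the max smoothed posterior in a window.

    posteriors: 1-D sequence of per-frame probabilities for one keyword part.
    window:     (start, end) frame indices, end exclusive; an assumed coarse
                interval in which the part is expected to occur.
    smooth_len: length of the moving-average filter (an assumption).
    """
    posteriors = np.asarray(posteriors, dtype=float)
    # Temporal smoothing: moving average, zero-padded at the edges.
    kernel = np.ones(smooth_len) / smooth_len
    smoothed = np.convolve(posteriors, kernel, mode="same")
    # Max pooling over time inside the window, so no exact frame label is needed.
    start, end = window
    peak = smoothed[start:end].max()
    # Clamp to avoid log(0).
    return -np.log(max(peak, 1e-12))


# A sustained activation gives a lower loss than an isolated one-frame spike.
sustained = smoothed_max_pool_loss([0.1, 0.1, 0.9, 0.9, 0.9, 0.1], window=(1, 5))
spike = smoothed_max_pool_loss([0.1, 0.1, 0.9, 0.1, 0.1, 0.1], window=(1, 5))
print(sustained < spike)
```

Under these assumptions, the smoothing step rewards activations that persist over several frames, while the max over the window removes the need for an exact frame-level alignment, which is consistent with the reduced dependency on LVCSR labels that the abstract describes.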
