Paper Title
Deep Learning for Automatic Tracking of Tongue Surface in Real-time Ultrasound Videos, Landmarks instead of Contours
Authors
Abstract
One use of medical ultrasound imaging is to visualize and characterize human tongue shape and motion during real-time speech in order to study healthy or impaired speech production. Because ultrasound images are low-contrast and noisy, recognizing tongue gestures may require expertise that non-expert users lack, for example in applications such as visual training for a second language. Moreover, quantitative analysis of tongue motion requires the tongue dorsum contour to be extracted, tracked, and visualized. Manual tongue contour extraction is a cumbersome, subjective, and error-prone task, and it is not feasible for real-time applications. Deep learning has been widely exploited in various computer vision tasks, including ultrasound tongue contour tracking. In current methods, tongue contour extraction comprises two steps: image segmentation and post-processing. This paper presents a novel approach to automatic, real-time tongue contour tracking using deep neural networks. In the proposed method, landmarks on the tongue surface are tracked instead of following the two-step procedure. This idea enables researchers in this field to benefit from available, previously annotated databases to achieve high-accuracy results. Our experiments revealed the outstanding performance of the proposed technique in terms of generalization, performance, and accuracy.