Speech Paralinguistic Approach for Detecting Dementia Using Gated Convolutional Neural Network

Paper Title

Speech Paralinguistic Approach for Detecting Dementia Using Gated Convolutional Neural Network

Authors

Mariana Rodrigues Makiuchi, Tifani Warnita, Nakamasa Inoue, Koichi Shinoda, Michitaka Yoshimura, Momoko Kitazawa, Kei Funaki, Yoko Eguchi, Taishiro Kishimoto

Abstract

We propose a non-invasive and cost-effective method to automatically detect dementia using only speech audio data. We extract paralinguistic features from a short speech segment and use a Gated Convolutional Neural Network (GCNN) to classify it as dementia or healthy. We evaluate our method on the Pitt Corpus and on our own dataset, the PROMPT Database. On the Pitt Corpus, our method yields an accuracy of 73.1% using an average of 114 seconds of speech data. On the PROMPT Database, it yields an accuracy of 74.7% using 4 seconds of speech data, improving to 80.8% when we use all of a patient's speech data. Furthermore, we evaluate our method on a three-class classification problem that includes the Mild Cognitive Impairment (MCI) class, achieving an accuracy of 60.6% with 40 seconds of speech data.
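To make the gating idea concrete, here is a minimal NumPy sketch of a single gated convolution over a sequence of per-frame feature vectors, followed by max-pooling over time and a linear classifier. It uses the common gated linear unit form, output = (X * W + b) ⊙ sigmoid(X * V + c). This is an illustration under assumptions, not the authors' architecture: the function names, the number of frames, the feature dimension, the kernel width, and the random (untrained) weights are all hypothetical.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def gated_conv1d(x, w_lin, w_gate, b_lin, b_gate):
    """One gated 1-D convolution over time (no padding, stride 1).

    x:       (T, D) array, T frames of D-dimensional features
    w_lin:   (k, D, C) weights of the linear path
    w_gate:  (k, D, C) weights of the gate path
    b_lin, b_gate: (C,) biases
    Returns a (T - k + 1, C) array.
    """
    k = w_lin.shape[0]
    # Stack every window of k consecutive frames: (T - k + 1, k, D).
    windows = np.stack([x[t:t + k] for t in range(x.shape[0] - k + 1)])
    lin = np.einsum("tkd,kdc->tc", windows, w_lin) + b_lin
    gate = np.einsum("tkd,kdc->tc", windows, w_gate) + b_gate
    # The sigmoid gate scales each linear activation into [0, 1] of its value.
    return lin * sigmoid(gate)


def classify_segment(x, conv_params, w_out, b_out):
    """Gated convolution, max-pool over time, then a linear layer."""
    hidden = gated_conv1d(x, *conv_params)
    pooled = hidden.max(axis=0)
    logits = pooled @ w_out + b_out
    return int(np.argmax(logits))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    T, D, C, k = 40, 16, 8, 3  # hypothetical sizes, not taken from the paper
    segment = rng.normal(size=(T, D))  # stand-in for extracted features
    conv_params = (
        rng.normal(size=(k, D, C)),
        rng.normal(size=(k, D, C)),
        np.zeros(C),
        np.zeros(C),
    )
    w_out, b_out = rng.normal(size=(C, 2)), np.zeros(2)
    # Class index 0 or 1 (e.g. healthy vs. dementia); meaningless until trained.
    print(classify_segment(segment, conv_params, w_out, b_out))
```

A real model would stack several such layers and learn the weights from labeled segments. The sketch only shows how the gate path modulates the linear path element by element.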
