Paper Title
A Dataset for Speech Emotion Recognition in Greek Theatrical Plays
Paper Authors
Abstract
Machine learning methodologies can be adopted in cultural applications to offer new ways of distributing or even presenting cultural content to the public. For instance, speech analytics can be used to automatically generate subtitles for theatrical plays, in order to (among other purposes) help people with hearing loss. Apart from typical speech-to-text transcription with Automatic Speech Recognition (ASR), Speech Emotion Recognition (SER) can be used to automatically predict the underlying emotional content of speech dialogues in theatrical plays, and thus to provide a deeper understanding of how the actors utter their lines. However, real-world datasets from theatrical plays are not available in the literature. In this work we present GreThE, the Greek Theatrical Emotion dataset, a new publicly available data collection for speech emotion recognition in Greek theatrical plays. The dataset contains utterances from various actors and plays, along with their respective valence and arousal annotations. To this end, multiple annotators were asked to provide their input for each speech recording, and inter-annotator agreement is taken into account when generating the final ground truth. In addition, we discuss the results of some indicative experiments conducted with machine learning and deep learning frameworks, using the dataset alongside some widely used databases in the field of speech emotion recognition.