Paper title
M-MELD: A Multilingual Multi-Party Dataset for Emotion Recognition in Conversations
Paper authors
Paper abstract
Expressing emotions is a crucial part of everyday human communication. Emotion recognition in conversations (ERC) is an emerging field of study in which the primary task is to identify the emotion behind each utterance in a conversation. Although much work has been done on ERC, prior work focuses only on English, ignoring other languages. In this paper, we present Multilingual MELD (M-MELD), which extends the Multimodal EmotionLines Dataset (MELD) \cite{poria2018meld} to four languages beyond English: Greek, Polish, French, and Spanish. Beyond establishing strong baselines for all four languages, we also propose a novel architecture, DiscLSTM, that uses both sequential and conversational discourse context in a dialogue for ERC. Our proposed approach is computationally efficient, can transfer across languages using just a cross-lingual encoder, and achieves better performance than most uni-modal text approaches in the literature on both MELD and M-MELD. We make our data and code publicly available on GitHub.