Paper Title

Multilevel Transformer for Multimodal Emotion Recognition

Authors

He, Junyi; Wu, Meimei; Li, Meng; Zhu, Xiaobo; Ye, Feng

Abstract

Multimodal emotion recognition has attracted much attention recently. Fusing multiple modalities effectively with limited labeled data is a challenging task. Given the success of pre-trained models and the fine-grained nature of emotion expression, it is reasonable to take both aspects into consideration. Unlike previous methods, which mainly focus on one aspect, we introduce a novel multi-granularity framework that combines fine-grained representation with pre-trained utterance-level representation. Inspired by Transformer TTS, we propose a multilevel transformer model to perform fine-grained multimodal emotion recognition. Specifically, we explore different methods of incorporating phoneme-level embedding with word-level embedding. To perform multi-granularity learning, we simply combine the multilevel transformer model with ALBERT. Extensive experimental results show that both our multilevel transformer model and our multi-granularity model outperform previous state-of-the-art approaches on the IEMOCAP dataset with text transcripts and speech signals.
