Paper Title
On the Complementarity of Images and Text for the Expression of Emotions in Social Media
Paper Authors
Paper Abstract
Authors of posts in social media communicate their emotions, and what causes them, with text and images. While there is work on emotion and stimulus detection for each modality separately, it is yet unknown whether the modalities contain complementary emotion information in social media. We aim to fill this research gap and contribute a novel, annotated corpus of English multimodal Reddit posts. On this resource, we develop models to automatically detect the relation between image and text, the emotion stimulus category, and the emotion class. We evaluate whether these tasks require both modalities and find that, for image-text relations, text alone is sufficient for most categories (complementary, illustrative, opposing): the information in the text allows us to predict whether an image is required for emotion understanding. The emotions of anger and sadness are best predicted with a multimodal model, while text alone is sufficient for disgust, joy, and surprise. Stimuli depicted by objects, animals, food, or a person are best predicted by image-only models, while multimodal models are most effective on art, events, memes, places, or screenshots.
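The abstract does not specify how the multimodal models are built. As one way to make the setup concrete, below is a minimal illustrative sketch of a late-fusion classifier in PyTorch that combines pre-extracted text and image features for one of the three tasks (e.g., emotion classification). The feature dimensions, fusion design, and class count are all assumptions for illustration, not the authors' implementation.

import torch
import torch.nn as nn

class LateFusionClassifier(nn.Module):
    """Illustrative late-fusion model: concatenates pre-extracted
    text and image features and predicts a label (emotion, stimulus
    category, or image-text relation). All dimensions are assumed."""

    def __init__(self, text_dim=768, image_dim=2048, num_classes=6):
        super().__init__()
        self.fusion = nn.Sequential(
            nn.Linear(text_dim + image_dim, 512),
            nn.ReLU(),
            nn.Dropout(0.2),
            nn.Linear(512, num_classes),
        )

    def forward(self, text_feats, image_feats):
        # Concatenate the two modalities along the feature axis,
        # then classify the fused representation.
        fused = torch.cat([text_feats, image_feats], dim=-1)
        return self.fusion(fused)

# Usage with random placeholder features for a batch of 4 posts.
model = LateFusionClassifier()
text_feats = torch.randn(4, 768)    # e.g., sentence embeddings from a text encoder
image_feats = torch.randn(4, 2048)  # e.g., pooled CNN image features
logits = model(text_feats, image_feats)
print(logits.shape)  # torch.Size([4, 6])

A text-only or image-only baseline, as compared in the paper's modality ablation, would simply drop one of the two inputs before the classification layer.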