Title

Multimodal Learning for Hateful Memes Detection

Authors

Yi Zhou, Zhenhao Chen

Abstract


Memes are used for spreading ideas through social networks. Although most memes are created for humor, some memes become hateful through the combination of picture and text. Automatically detecting hateful memes can help reduce their harmful social impact. Unlike conventional multimodal tasks, where the visual and textual information is semantically aligned, the challenge of hateful memes detection lies in its unique multimodal information. The image and text in memes are weakly aligned or even unrelated, which requires the model to understand the content and perform reasoning over multiple modalities. In this paper, we focus on multimodal hateful memes detection and propose a novel method that incorporates the image captioning process into the memes detection process. We conduct extensive experiments on multimodal meme datasets and illustrate the effectiveness of our approach. Our model achieves promising results on the Hateful Memes Detection Challenge.
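The abstract describes a pipeline in which an image-captioning step feeds into the meme classifier, so that the weakly aligned image content is made available as text. A minimal sketch of that idea follows; the captioner, the `[SEP]` fusion, and the keyword-based classifier are all stand-ins invented for illustration, not the models or features used in the paper.

```python
# Hypothetical sketch: caption the meme image, fuse the caption with the
# overlaid meme text, then classify the fused text. The paper's actual
# captioner and multimodal classifier are learned models; these stubs
# only illustrate the data flow.

def caption_image(image_id: str) -> str:
    # Stand-in for an image-captioning model that turns the image
    # modality into text the classifier can reason over.
    captions = {"meme_01": "a dog sitting on a couch"}
    return captions.get(image_id, "an image")

def classify(meme_text: str, image_id: str) -> str:
    # Fuse the two modalities by concatenating the caption (a textual
    # proxy for the image) with the overlaid meme text.
    fused = caption_image(image_id) + " [SEP] " + meme_text
    # Stand-in classifier: a real system would feed the fused text plus
    # visual features into a trained multimodal model.
    hateful_cues = {"hate", "attack"}
    tokens = set(fused.lower().split())
    return "hateful" if tokens & hateful_cues else "not-hateful"

print(classify("love your pets", "meme_01"))  # not-hateful
```

The point of the fusion step is that a benign caption can contradict or contextualize the overlaid text, which is exactly the weak-alignment signal the abstract highlights.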
