Paper Title

IITD at the WANLP 2022 Shared Task: Multilingual Multi-Granularity Network for Propaganda Detection

Authors

Shubham Mittal, Preslav Nakov

Abstract

We present our system for the two subtasks of the shared task on propaganda detection in Arabic, part of WANLP'2022. Subtask 1 is a multi-label classification problem: find the propaganda techniques used in a given tweet. Our system for this task uses XLM-R to predict, for each technique, the probability that the target tweet uses it. In addition to finding the techniques, Subtask 2 asks for the textual span of each instance of each technique present in the tweet; the task can be modeled as a sequence tagging problem. We use a multi-granularity network with an mBERT encoder for Subtask 2. Overall, our system ranks second for both subtasks (out of 14 and 3 participants, respectively). Our empirical analysis shows that it does not help to use a much larger English corpus annotated with propaganda techniques, regardless of whether it is used in English or after translation to Arabic.
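The two problem formulations named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the technique names, the 0.5 decision threshold, the per-technique BIO tagging scheme, and the function names `predict_techniques` and `spans_to_tags` are all assumptions made for illustration. The first function turns one encoder logit per technique into a multi-label prediction (Subtask 1); the second converts character-level span annotations into per-token tags, which is one common way to cast span identification as sequence tagging (Subtask 2).

```python
import math

# Hypothetical label subset for illustration only; the shared task
# defines its own inventory of propaganda techniques.
TECHNIQUES = ["Loaded Language", "Name Calling/Labeling", "Smears"]


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def predict_techniques(logits, threshold=0.5):
    """Multi-label decision: each technique is thresholded independently,
    so a tweet can receive zero, one, or several labels.
    The 0.5 threshold is an assumption, not a value from the paper."""
    probs = [sigmoid(z) for z in logits]
    return [name for name, p in zip(TECHNIQUES, probs) if p >= threshold]


def spans_to_tags(tokens, spans, technique):
    """Convert character spans into BIO tags for a single technique.

    tokens: list of (text, start, end) with character offsets.
    spans:  list of (technique, start, end) annotations.
    Tagging one technique at a time (an assumed design) sidesteps
    overlapping spans that belong to different techniques.
    """
    tags = ["O"] * len(tokens)
    for name, s, e in spans:
        if name != technique:
            continue
        inside = False
        for i, (_, ts, te) in enumerate(tokens):
            if ts < e and te > s:  # token overlaps the annotated span
                tags[i] = "I" if inside else "B"
                inside = True
    return tags


if __name__ == "__main__":
    # Made-up logits standing in for the classifier head's output.
    print(predict_techniques([2.0, -1.0, 0.0]))

    toks = [("They", 0, 4), ("are", 5, 8), ("vile", 9, 13), ("traitors", 14, 22)]
    anns = [("Name Calling/Labeling", 9, 22)]
    print(spans_to_tags(toks, anns, "Name Calling/Labeling"))
```

In a real system the logits would come from a classification head on top of an encoder such as XLM-R, and the token offsets from that encoder's tokenizer; both are replaced by hand-written values here to keep the sketch self-contained.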
