Paper Title

How Effective is Incongruity? Implications for Code-mix Sarcasm Detection

Authors

Aditya Shah, Chandresh Kumar Maurya

Abstract

The presence of sarcasm in conversational systems and social media such as chatbots, Facebook, and Twitter poses several challenges for downstream NLP tasks. This is because the intended meaning of a sarcastic text is contrary to what is expressed. Further, the use of code-mix language to express sarcasm is increasing day by day. Current NLP techniques for code-mix data have limited success due to the use of different lexicons and syntax and the scarcity of labeled corpora. To solve the joint problem of code-mixing and sarcasm detection, we propose the idea of capturing incongruity through sub-word level embeddings learned via fastText. Empirical results show that our proposed model achieves an F1-score on a code-mix Hinglish dataset comparable to pretrained multilingual models while training 10x faster and using a lower memory footprint.
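
To make the sub-word idea concrete, the sketch below extracts character n-grams in the way fastText is generally described as doing: each word is wrapped in `<` and `>` boundary markers and split into overlapping character sequences. This is an illustration only and not the authors' model. The function names `char_ngrams` and `shared_ngrams`, the n-gram range of 3 to 6, and the example spellings are all assumptions made here for demonstration.

```python
def char_ngrams(word, min_n=3, max_n=6):
    """Return the set of character n-grams of a word, fastText-style.

    The word is wrapped in '<' and '>' so that prefixes and suffixes
    are distinguishable from word-internal sequences.
    The default range 3..6 is an assumption, not a value from the paper.
    """
    token = f"<{word}>"
    grams = set()
    for n in range(min_n, max_n + 1):
        # Slide a window of width n across the wrapped token.
        for i in range(len(token) - n + 1):
            grams.add(token[i:i + n])
    return grams


def shared_ngrams(word_a, word_b, min_n=3, max_n=6):
    """Return the n-grams that two words have in common."""
    return char_ngrams(word_a, min_n, max_n) & char_ngrams(word_b, min_n, max_n)


if __name__ == "__main__":
    # Two hypothetical romanized spellings of the same Hindi word.
    print(sorted(char_ngrams("nahi", 3, 3)))
    print(sorted(shared_ngrams("nahi", "nahin", 3, 3)))
```

Because romanized Hindi has no standard spelling, variants of one word tend to share many such n-grams, so a sub-word model can give them nearby vectors even when one variant never appeared in training.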
