Paper Title
Detecting Insincere Questions from Text: A Transfer Learning Approach
Paper Authors
Paper Abstract
The internet today has become an unrivalled source of information, where people converse on content-based websites such as Quora, Reddit, StackOverflow and Twitter, asking questions and sharing knowledge with the world. A major problem arising on such websites is the proliferation of toxic comments and instances of insincerity, wherein users, instead of maintaining a sincere motive, indulge in spreading toxic and divisive content. The straightforward course of action in confronting this situation is to detect such content beforehand and prevent it from persisting online. In recent times, transfer learning in Natural Language Processing has seen unprecedented growth. With the advent of transformers and various state-of-the-art innovations, tremendous progress has been made across many NLP domains. The introduction of BERT caused quite a stir in the NLP community: upon publication, BERT dominated performance benchmarks, inspiring many other authors to experiment with it and publish similar models. This led to the development of a whole BERT family, each member specialized for a different task. In this paper we address the Insincere Questions Classification problem by fine-tuning four cutting-edge models, viz. BERT, RoBERTa, DistilBERT and ALBERT.
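To make the fine-tuning setup concrete, the sketch below shows how one of these models could be adapted to binary (sincere vs. insincere) question classification, assuming the Hugging Face transformers library. The toy examples, hyperparameters, and checkpoint name are illustrative assumptions, not the paper's actual configuration.

    # Minimal sketch: fine-tuning a BERT-family model for insincere
    # question classification. Labels: 0 = sincere, 1 = insincere.
    import torch
    from torch.optim import AdamW
    from transformers import AutoTokenizer, AutoModelForSequenceClassification

    # Swap in "roberta-base", "distilbert-base-uncased", or "albert-base-v2"
    # to try the other three models discussed in the paper.
    MODEL_NAME = "bert-base-uncased"

    tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
    model = AutoModelForSequenceClassification.from_pretrained(MODEL_NAME, num_labels=2)

    # Toy stand-in data; a real run would use the full labelled dataset.
    questions = [
        "How do I prepare for a software engineering interview?",
        "Why are people from group X so stupid?",
    ]
    labels = torch.tensor([0, 1])

    batch = tokenizer(questions, padding=True, truncation=True,
                      max_length=128, return_tensors="pt")

    optimizer = AdamW(model.parameters(), lr=2e-5)
    model.train()
    for epoch in range(3):  # a few epochs is typical when fine-tuning
        optimizer.zero_grad()
        # Passing labels makes the model compute cross-entropy loss internally.
        outputs = model(**batch, labels=labels)
        outputs.loss.backward()
        optimizer.step()

    # Inference: argmax over the two logits gives the predicted class.
    model.eval()
    with torch.no_grad():
        preds = model(**batch).logits.argmax(dim=-1)
    print(preds)  # tensor of 0s (sincere) and 1s (insincere)

Because all four models expose the same sequence-classification interface, comparing them reduces to changing MODEL_NAME and re-running the same loop.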