Paper Title
COVID-Twitter-BERT: A Natural Language Processing Model to Analyse COVID-19 Content on Twitter

Paper Authors

Martin Müller, Marcel Salathé, Per E. Kummervold

Abstract
In this work, we release COVID-Twitter-BERT (CT-BERT), a transformer-based model, pretrained on a large corpus of Twitter messages on the topic of COVID-19. Our model shows a 10-30% marginal improvement compared to its base model, BERT-Large, on five different classification datasets. The largest improvements are on the target domain. Pretrained transformer models, such as CT-BERT, are trained on a specific target domain and can be used for a wide variety of natural language processing tasks, including classification, question-answering and chatbots. CT-BERT is optimised to be used on COVID-19 content, in particular social media posts from Twitter.