Paper Title

FREDA: Flexible Relation Extraction Data Annotation

Authors

Michael Strobl, Amine Trabelsi, Osmar Zaiane

Abstract

Training accurate Relation Extraction models requires sufficient, properly labeled data. Such data is difficult to obtain, and annotating it is a tricky undertaking. Previous work has shown that either accuracy has to be sacrificed or, if the annotation is done accurately, the task is extremely time-consuming. We propose an approach for quickly producing high-quality datasets for the task of Relation Extraction. Neural models trained to do Relation Extraction on the created datasets achieve very good results and generalize well to other datasets. In our study, we were able to annotate 10,022 sentences for 19 relations in a reasonable amount of time, and trained a commonly used baseline model for each relation.
