Paper Title

Learning a Cost-Effective Annotation Policy for Question Answering

Authors

Bernhard Kratzwald, Stefan Feuerriegel, Huan Sun

Abstract

State-of-the-art question answering (QA) relies on large amounts of training data, for which labeling is time-consuming and thus expensive. For this reason, customizing QA systems is challenging. As a remedy, we propose a novel framework for annotating QA datasets that entails learning a cost-effective annotation policy and a semi-supervised annotation scheme. The latter reduces human effort: it leverages the underlying QA system to suggest potential candidate annotations, and human annotators then simply provide binary feedback on these candidates. Our system is designed such that past annotations continuously improve future performance and thus reduce the overall annotation cost. To the best of our knowledge, this is the first paper to address the problem of annotating questions with minimal annotation cost. We compare our framework against traditional manual annotation in an extensive set of experiments and find that our approach can reduce annotation cost by up to 21.1%.
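The trade-off described in the abstract, between asking a human for a full manual annotation and asking only for binary feedback on model-suggested candidates, can be illustrated with a small expected-cost calculation. The sketch below is not the policy from the paper, which is learned. It is a hand-written rule under stated assumptions: the function names, the cost values, the independence of candidate correctness, and the fallback to manual annotation when every candidate is rejected are all hypothetical.

```python
from typing import Sequence


def expected_semi_supervised_cost(
    probs: Sequence[float], check_cost: float, manual_cost: float
) -> float:
    """Expected cost of showing candidates one by one for yes/no feedback.

    Assumptions (illustrative, not from the paper):
    - probs[i] is the estimated chance that candidate i is correct,
      treated as independent of the other candidates;
    - the annotator stops at the first accepted candidate;
    - if every candidate is rejected, a full manual annotation follows.
    """
    cost = 0.0
    p_reach = 1.0  # probability that all earlier candidates were rejected
    for p in probs:
        cost += p_reach * check_cost  # pay for this check only if we get here
        p_reach *= 1.0 - p
    return cost + p_reach * manual_cost  # fallback after all rejections


def choose_mode(
    probs: Sequence[float], check_cost: float, manual_cost: float
) -> str:
    """Pick the cheaper annotation mode in expectation."""
    semi = expected_semi_supervised_cost(probs, check_cost, manual_cost)
    return "semi_supervised" if semi < manual_cost else "manual"


if __name__ == "__main__":
    # Hypothetical costs: one yes/no check costs 1 unit, a manual label 10.
    print(choose_mode([0.5, 0.5], check_cost=1.0, manual_cost=10.0))
    print(choose_mode([0.05], check_cost=1.0, manual_cost=10.0))
```

Under these assumptions, binary feedback pays off only when the QA system's candidates are likely enough to be correct. This matches the intuition in the abstract that a QA system improved by past annotations makes later annotations cheaper.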
