Paper Title

MEE: A Novel Multilingual Event Extraction Dataset

Authors

Amir Pouran Ben Veyseh, Javid Ebrahimi, Franck Dernoncourt, Thien Huu Nguyen

Abstract

Event Extraction (EE) is one of the fundamental tasks in Information Extraction (IE) that aims to recognize event mentions and their arguments (i.e., participants) from text. Due to its importance, extensive methods and resources have been developed for Event Extraction. However, one limitation of current research for EE involves the under-exploration for non-English languages in which the lack of high-quality multilingual EE datasets for model training and evaluation has been the main hindrance. To address this limitation, we propose a novel Multilingual Event Extraction dataset (MEE) that provides annotation for more than 50K event mentions in 8 typologically different languages. MEE comprehensively annotates data for entity mentions, event triggers and event arguments. We conduct extensive experiments on the proposed dataset to reveal challenges and opportunities for multilingual EE.
