Paper Title

Cross-lingual Universal Dependency Parsing Only from One Monolingual Treebank

Authors

Kailai Sun, Zuchao Li, Hai Zhao

Abstract

Syntactic parsing is a linguistically demanding task, and parsers must be trained on treebanks built from expensive human annotation. Because a treebank is unlikely to become available for every human language, we propose an effective cross-lingual UD parsing framework that transfers a parser from a single monolingual source treebank to any target language that has no treebank. To reach satisfactory parsing accuracy across quite different languages, we add two language modeling tasks to dependency parsing in a multi-task setup. Assuming that only unlabeled data from the target languages and the source treebank are available, we also adopt a self-training strategy to further improve the multi-task framework. Our cross-lingual parsers are implemented for English, Chinese, and 22 UD treebanks. The empirical study shows that, for the first time, our cross-lingual parsers yield promising results for all target languages, approaching the performance of parsers trained on each target language's own treebank.
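
The self-training idea mentioned in the abstract can be illustrated with a generic loop: train on the labeled source data, parse unlabeled target sentences, keep the confident parses as pseudo-labels, and retrain. The sketch below is only an assumption about the general technique, not the authors' implementation. The names `self_train` and `train_toy_parser`, the confidence threshold, and the toy "parser" (which just chains each token to the previous one) are all hypothetical, and the paper's multi-task language modeling objectives are not reproduced here.

```python
from typing import Callable, List, Tuple

Sentence = List[str]
Heads = List[int]  # head index per token, 0 = artificial root
Example = Tuple[Sentence, Heads]
Parser = Callable[[Sentence], Tuple[Heads, float]]


def train_toy_parser(data: List[Example]) -> Parser:
    """Hypothetical stand-in for real parser training.

    It only memorises the training vocabulary; confidence is the
    fraction of tokens in a sentence that were seen during training.
    """
    vocab = {tok for sent, _ in data for tok in sent}

    def parse(sent: Sentence) -> Tuple[Heads, float]:
        # Toy structure: first token attaches to root, token i to token i-1.
        heads = list(range(len(sent)))
        known = sum(tok in vocab for tok in sent)
        return heads, known / len(sent)

    parse.vocab = vocab  # exposed only so the example can be inspected
    return parse


def self_train(
    train_fn: Callable[[List[Example]], Parser],
    labeled: List[Example],
    unlabeled: List[Sentence],
    rounds: int = 2,
    threshold: float = 0.6,
) -> Parser:
    """Generic self-training loop over a fixed pool of unlabeled sentences."""
    model = train_fn(labeled)
    for _ in range(rounds):
        # Pseudo-label the unlabeled pool with the current model.
        pseudo = []
        for sent in unlabeled:
            heads, conf = model(sent)
            if conf >= threshold:
                pseudo.append((sent, heads))
        if not pseudo:
            break  # nothing confident enough to add
        # Retrain on gold source data plus confident pseudo-labels.
        model = train_fn(list(labeled) + pseudo)
    return model


source = [(["the", "dog", "barks"], [2, 3, 0])]
target = [["the", "cat", "barks"], ["xx", "yy"]]
model = self_train(train_toy_parser, source, target)
```

In this toy run the first target sentence shares enough vocabulary with the source data to pass the threshold and is added as a pseudo-labeled example, while the second is never confident enough and stays out of the training set.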
