A Hybrid Approach to Dependency Parsing: Combining Rules and Morphology with Deep Learning

Paper Title

A Hybrid Approach to Dependency Parsing: Combining Rules and Morphology with Deep Learning

Authors

Özateş, Şaziye Betül, Özgür, Arzucan, Güngör, Tunga, Öztürk, Balkız

Abstract

Fully data-driven, deep learning-based models are usually designed to be language-independent and have been shown to be successful for many natural language processing tasks. However, when the studied language is low-resourced and the amount of training data is insufficient, these models can benefit from the integration of natural language grammar-based information. We propose two approaches to dependency parsing, especially for languages with a restricted amount of training data. Our first approach combines a state-of-the-art deep learning-based parser with a rule-based approach, and the second one incorporates morphological information into the parser. In the rule-based approach, the parsing decisions made by the rules are encoded and concatenated with the vector representations of the input words as additional information for the deep network. The morphology-based approach proposes different methods to include the morphological structure of words in the parser network. Experiments are conducted on the IMST-UD Treebank, and the results suggest that integrating explicit knowledge about the target language into a neural parser through a rule-based parsing system and morphological analysis leads to more accurate annotations and hence increases the parsing performance in terms of attachment scores. The proposed methods are developed for Turkish, but can be adapted to other languages as well.
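
The concatenation step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the label inventory `RULE_LABELS`, the one-hot encoding, the helper names `one_hot` and `augment_with_rules`, and the toy embedding values are all assumptions made for illustration, since the abstract does not specify how rule decisions are encoded.

```python
from typing import List

# Hypothetical inventory of rule decisions; not taken from the paper.
RULE_LABELS: List[str] = ["NONE", "ATTACH_LEFT", "ATTACH_RIGHT"]


def one_hot(label: str, labels: List[str] = RULE_LABELS) -> List[float]:
    """Encode a rule decision as a one-hot vector over the label inventory."""
    if label not in labels:
        raise ValueError(f"unknown rule label: {label}")
    return [1.0 if label == candidate else 0.0 for candidate in labels]


def augment_with_rules(
    word_vectors: List[List[float]], rule_decisions: List[str]
) -> List[List[float]]:
    """Concatenate each word vector with the encoding of its rule decision."""
    if len(word_vectors) != len(rule_decisions):
        raise ValueError("need exactly one rule decision per word")
    return [vec + one_hot(dec) for vec, dec in zip(word_vectors, rule_decisions)]


# Toy input: three words with 2-dimensional embeddings (made-up values).
embeddings = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]
decisions = ["ATTACH_RIGHT", "NONE", "ATTACH_LEFT"]
augmented = augment_with_rules(embeddings, decisions)
```

Each augmented vector here has the original embedding dimensions followed by one slot per rule label, so a downstream network would see the rule decision as extra input features alongside the word representation.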
