Paper Title

An Analysis of the Utility of Explicit Negative Examples to Improve the Syntactic Abilities of Neural Language Models

Authors

Hiroshi Noji, Hiroya Takamura

Abstract

We explore the utilities of explicit negative examples in training neural language models. Negative examples here are incorrect words in a sentence, such as "barks" in "*The dogs barks". Neural language models are commonly trained only on positive examples, a set of sentences in the training data, but recent studies suggest that the models trained in this way are not capable of robustly handling complex syntactic constructions, such as long-distance agreement. In this paper, using English data, we first demonstrate that appropriately using negative examples about particular constructions (e.g., subject-verb agreement) will boost the model's robustness on them, with a negligible loss of perplexity. The key to our success is an additional margin loss between the log-likelihoods of a correct word and an incorrect word. We then provide a detailed analysis of the trained models. One of our findings is the difficulty of object-relative clauses for RNNs. We find that even with our direct learning signals the models still suffer from resolving agreement across an object-relative clause. Augmentation of training sentences involving the constructions somewhat helps, but the accuracy still does not reach the level of subject-relative clauses. Although not directly cognitively appealing, our method can be a tool to analyze the true architectural limitation of neural models on challenging linguistic constructions.
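The margin loss mentioned in the abstract can be written as a simple hinge penalty added to the standard language-modeling objective. Below is a minimal sketch in PyTorch, not the authors' implementation; the function name, the margin hyperparameter `delta`, and the reduction by summation are illustrative assumptions.

```python
import torch

def agreement_margin_loss(log_p_correct, log_p_incorrect, delta=1.0):
    """Hinge-style margin loss between the log-likelihoods of the correct
    word (e.g., "bark" in "The dogs bark") and the negative-example word
    (e.g., "barks"). The penalty is zero once the correct word outscores
    the incorrect one by at least the margin delta; otherwise it grows
    linearly with the shortfall.

    log_p_correct, log_p_incorrect: tensors of log-probabilities assigned
    by the language model at the target positions (e.g., the verb in a
    subject-verb agreement construction).
    """
    return torch.clamp(delta - (log_p_correct - log_p_incorrect), min=0.0).sum()

# In training, this term would be added to the usual cross-entropy LM loss,
# computed only at the annotated target words (an assumed setup):
# total_loss = lm_cross_entropy + agreement_margin_loss(lp_correct, lp_incorrect)
```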
