Paper Title


Influence Paths for Characterizing Subject-Verb Number Agreement in LSTM Language Models

Authors

Kaiji Lu, Piotr Mardziel, Klas Leino, Matt Fredrikson, Anupam Datta

Abstract


LSTM-based recurrent neural networks are the state-of-the-art for many natural language processing (NLP) tasks. Despite their performance, it is unclear whether, or how, LSTMs learn structural features of natural languages such as subject-verb number agreement in English. Lacking this understanding, the generality of LSTM performance on this task and their suitability for related tasks remains uncertain. Further, errors cannot be properly attributed to a lack of structural capability, training data omissions, or other exceptional faults. We introduce *influence paths*, a causal account of structural properties as carried by paths across gates and neurons of a recurrent neural network. The approach refines the notion of influence (the subject's grammatical number has influence on the grammatical number of the subsequent verb) into a set of gate or neuron-level paths. The set localizes and segments the concept (e.g., subject-verb agreement), its constituent elements (e.g., the subject), and related or interfering elements (e.g., attractors). We exemplify the methodology on a widely-studied multi-layer LSTM language model, demonstrating its accounting for subject-verb number agreement. The results offer both a finer and a more complete view of an LSTM's handling of this structural aspect of the English language than prior results based on diagnostic classifiers and ablation.
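The abstract's central move — refining a single input-to-output influence into a set of per-path contributions — can be illustrated on a toy two-neuron network. This is a hypothetical sketch of the path-decomposition principle only, not the paper's LSTM instrumentation: by the chain rule, the gradient of the output with respect to the input decomposes exactly into a sum, over paths, of products of local derivatives along each path.

```python
import math

# Toy "network": y = w1*tanh(a1*x) + w2*tanh(a2*x).
# The two summands stand in for two neuron-level paths from input x to output y.
# All names (a1, a2, w1, w2, x0) are illustrative, not from the paper.
a1, a2, w1, w2, x0 = 0.8, -0.3, 0.5, 2.0, 0.7

def y(x):
    return w1 * math.tanh(a1 * x) + w2 * math.tanh(a2 * x)

def dtanh(z):
    # derivative of tanh at z
    return 1.0 - math.tanh(z) ** 2

# Per-path influence: product of local derivatives along each path x -> h_i -> y.
paths = {
    "x -> h1 -> y": a1 * dtanh(a1 * x0) * w1,
    "x -> h2 -> y": a2 * dtanh(a2 * x0) * w2,
}

# The path contributions sum exactly to the total gradient dy/dx,
# checked here against a central finite difference.
eps = 1e-6
total = (y(x0 + eps) - y(x0 - eps)) / (2 * eps)
assert math.isclose(sum(paths.values()), total, rel_tol=1e-5)
```

In the paper's setting the paths run through the gates and hidden units of a multi-layer LSTM unrolled over the sentence, so each path product localizes how much of the subject-to-verb number influence flows through a particular gate at a particular timestep.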
