Paper Title
Do LSTMs See Gender? Probing the Ability of LSTMs to Learn Abstract Syntactic Rules
Paper Authors
Paper Abstract
LSTMs trained on next-word prediction can accurately perform linguistic tasks that require tracking long-distance syntactic dependencies. Notably, model accuracy approaches human performance on number agreement tasks (Gulordava et al., 2018). However, we do not have a mechanistic understanding of how LSTMs perform such linguistic tasks. Do LSTMs learn abstract grammatical rules, or do they rely on simple heuristics? Here, we test gender agreement in French, which requires tracking both hierarchical syntactic structures and the inherent gender of lexical units. Our model is able to reliably predict long-distance gender agreement in two subject-predicate contexts: noun-adjective and noun-passive-verb agreement. The model showed more inaccuracies on plural noun phrases with gender attractors than on singular cases, suggesting a reliance on cues from gendered articles for agreement. Overall, our study highlights key ways in which LSTMs deviate from human behaviour and questions whether LSTMs genuinely learn abstract syntactic rules and categories. We propose using gender agreement as a useful probe to investigate the underlying mechanisms, internal representations, and linguistic capabilities of LSTM language models.
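The agreement-probing paradigm the abstract describes can be illustrated with a minimal sketch: present a language model with a grammatical sentence and a minimally different variant that violates agreement, and check which one the model assigns higher probability. The snippet below is a hypothetical illustration only; the `TOY_LOGPROBS` table is an invented unigram stand-in for per-token scores, not the paper's trained LSTM, and the function names are assumptions.

```python
import math  # not strictly needed here, but typical when working with log-probs

# Toy per-token log-probabilities standing in for an LSTM language model.
# "table" is feminine in French, so the feminine adjective "verte" agrees
# and the masculine "vert" is an agreement violation.
TOY_LOGPROBS = {
    "la": -1.0, "table": -3.0, "est": -1.5,
    "verte": -4.0,   # feminine adjective: agrees with "table"
    "vert": -6.0,    # masculine adjective: agreement violation
}

def sentence_logprob(tokens, logprobs=TOY_LOGPROBS):
    """Sum per-token log-probabilities (unigram stand-in for an LM score)."""
    return sum(logprobs.get(t, -10.0) for t in tokens)

def prefers_grammatical(grammatical, ungrammatical):
    """Agreement probe: does the model score the grammatical variant higher?"""
    return sentence_logprob(grammatical) > sentence_logprob(ungrammatical)

# "la table est verte" (grammatical) vs. "la table est vert" (violation)
print(prefers_grammatical(
    ["la", "table", "est", "verte"],
    ["la", "table", "est", "vert"],
))  # → True with this toy distribution
```

With a real LSTM, `sentence_logprob` would instead sum the model's conditional log-probabilities of each token given its left context; the comparison logic stays the same.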