Paper Title

Composition, Attention, or Both?

Paper Authors

Ryo Yoshida, Yohei Oseki

Paper Abstract


In this paper, we propose a novel architecture called Composition Attention Grammars (CAGs) that recursively compose subtrees into a single vector representation with a composition function, and selectively attend to previous structural information with a self-attention mechanism. We investigate whether these components -- the composition function and the self-attention mechanism -- can both induce human-like syntactic generalization. Specifically, we train language models (LMs) with and without these two components with the model sizes carefully controlled, and evaluate their syntactic generalization performance against six test circuits on the SyntaxGym benchmark. The results demonstrated that the composition function and the self-attention mechanism both play an important role in making LMs more human-like, and closer inspection of linguistic phenomena implied that the composition function allowed syntactic features, but not semantic features, to percolate into subtree representations.
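For illustration only, the sketch below shows how the two components contrasted in the abstract might be realized: a composition function that reduces the child vectors of a completed subtree into a single vector, and self-attention over previously built structural representations. The module choices (a BiLSTM composition, single-head attention), the dimensions, and the names CompositionFunction and StructuralSelfAttention are assumptions for this sketch, not the authors' CAG implementation.

```python
# Minimal, illustrative sketch (not the authors' implementation) of the two
# components discussed in the abstract. All module choices and names are
# assumptions made for illustration.
import torch
import torch.nn as nn


class CompositionFunction(nn.Module):
    """Compose the child vectors of a completed subtree into one vector."""

    def __init__(self, dim: int):
        super().__init__()
        # BiLSTM over the children, in the spirit of RNNG-style composition.
        self.bilstm = nn.LSTM(dim, dim, bidirectional=True, batch_first=True)
        self.proj = nn.Linear(2 * dim, dim)

    def forward(self, children: torch.Tensor) -> torch.Tensor:
        # children: (num_children, dim) -> single subtree vector of shape (dim,)
        out, _ = self.bilstm(children.unsqueeze(0))
        half = out.size(-1) // 2
        fwd = out[0, -1, :half]   # forward direction, last position
        bwd = out[0, 0, half:]    # backward direction, first position
        return self.proj(torch.cat([fwd, bwd], dim=-1))


class StructuralSelfAttention(nn.Module):
    """Selectively attend over previously composed structural representations."""

    def __init__(self, dim: int):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, num_heads=1, batch_first=True)

    def forward(self, stack: torch.Tensor) -> torch.Tensor:
        # stack: (stack_len, dim); the top element queries the whole stack.
        q = stack[-1:].unsqueeze(0)   # (1, 1, dim)
        kv = stack.unsqueeze(0)       # (1, stack_len, dim)
        out, _ = self.attn(q, kv, kv)
        return out[0, 0]              # attended structural context, (dim,)


if __name__ == "__main__":
    dim = 8
    compose = CompositionFunction(dim)
    attend = StructuralSelfAttention(dim)
    children = torch.randn(3, dim)     # vectors of a subtree's children
    subtree = compose(children)        # one vector per completed subtree
    stack = torch.stack([torch.randn(dim), torch.randn(dim), subtree])
    context = attend(stack)            # context used to predict the next action
    print(subtree.shape, context.shape)
```

The point of the contrast in the paper is that a model may have either, both, or neither of these components; the sketch simply makes the two mechanisms concrete side by side.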
