Paper Title
A Systematic Analysis of Morphological Content in BERT Models for Multiple Languages
Paper Authors
Paper Abstract
This work describes experiments that probe the hidden representations of several BERT-style models for morphological content. The goal is to examine the extent to which discrete linguistic structure, in the form of morphological features and feature values, presents itself in the vector representations and attention distributions of pre-trained language models for five European languages. The experiments contained herein show that (i) Transformer architectures largely partition their embedding space into convex sub-regions highly correlated with morphological feature value, (ii) the contextualized nature of Transformer embeddings allows models to distinguish ambiguous morphological forms in many, but not all, cases, and (iii) very specific attention head/layer combinations appear to home in on subject-verb agreement.
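
The abstract leaves the probing setup at a high level. The sketch below illustrates one common recipe for this kind of experiment, not the paper's actual method: extract a word's hidden-state vector from a frozen, pre-trained BERT-style model and fit a linear probe on a morphological feature. The model name, the toy English Number examples, the layer index, and the logistic-regression probe are all illustrative assumptions.

```python
# A minimal sketch of a morphological probing experiment, assuming a
# Hugging Face transformers setup. Everything here (model choice, toy
# data, layer 8, the linear probe) is illustrative, not the paper's setup.
import torch
from sklearn.linear_model import LogisticRegression
from transformers import AutoModel, AutoTokenizer

MODEL_NAME = "bert-base-multilingual-cased"  # assumed stand-in model

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModel.from_pretrained(MODEL_NAME, output_hidden_states=True)
model.eval()

# Toy probing data: (sentence, target word, feature value). The feature
# here is Number (Sing/Plur), chosen purely for illustration.
examples = [
    ("The dog sleeps.", "dog", "Sing"),
    ("The dogs sleep.", "dogs", "Plur"),
    ("A child runs.", "child", "Sing"),
    ("The children run.", "children", "Plur"),
]

def word_embedding(sentence, word, layer=8):
    """Mean-pool the subword vectors covering `word` at one hidden layer."""
    enc = tokenizer(sentence, return_tensors="pt", return_offsets_mapping=True)
    offsets = enc.pop("offset_mapping")[0].tolist()  # per-token char spans
    with torch.no_grad():
        hidden = model(**enc).hidden_states[layer][0]  # (seq_len, dim)
    start = sentence.index(word)
    end = start + len(word)
    # Keep the tokens whose character span falls inside the target word.
    idx = [i for i, (s, e) in enumerate(offsets)
           if s >= start and e <= end and e > s]
    return hidden[idx].mean(dim=0)

X = torch.stack([word_embedding(s, w) for s, w, _ in examples]).numpy()
y = [label for _, _, label in examples]

# A linear probe: if it separates the classes from frozen embeddings, the
# feature value is (close to) linearly separable in this layer's space,
# consistent in spirit with the "convex sub-regions" finding.
probe = LogisticRegression(max_iter=1000).fit(X, y)
print("train accuracy:", probe.score(X, y))
```

A real experiment along these lines would evaluate on held-out data, sweep over layers, and draw labeled examples from morphologically annotated corpora for each of the five languages rather than from a handful of hand-written sentences.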