Paper Title
minicons: Enabling Flexible Behavioral and Representational Analyses of Transformer Language Models
Paper Authors
Paper Abstract
We present minicons, an open source library that provides a standard API for researchers interested in conducting behavioral and representational analyses of transformer-based language models (LMs). Specifically, minicons enables researchers to apply analysis methods at two levels: (1) at the prediction level, by providing functions to efficiently extract word/sentence level probabilities; and (2) at the representational level, by also facilitating efficient extraction of word/phrase level vectors from one or more layers. In this paper, we describe the library and apply it to two motivating case studies: one focusing on the learning dynamics of the BERT architecture on relative grammatical judgments, and the other on benchmarking 23 different LMs on zero-shot abductive reasoning. minicons is available at https://github.com/kanishkamisra/minicons.
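The two analysis levels described in the abstract correspond to the library's scorer and cwe modules. Below is a minimal sketch of what each looks like in practice, following the usage shown in the project README; the specific model names, method signatures (sequence_score, extract_representation), and their defaults are assumptions that may differ across library versions.

```python
# A minimal sketch of minicons' two analysis levels, based on the README;
# method names and defaults are assumed and may vary between versions.
from minicons import scorer, cwe

# (1) Prediction level: score sentences with an incremental (autoregressive) LM.
ilm = scorer.IncrementalLMScorer("distilgpt2", "cpu")
stimuli = [
    "The keys to the cabinet are on the table.",
    "The keys to the cabinet is on the table.",
]
# sequence_score aggregates per-token log-probabilities into one score
# per sentence, so the grammatical variant should score higher.
print(ilm.sequence_score(stimuli))

# (2) Representational level: extract contextual word vectors.
model = cwe.CWE("bert-base-uncased")
pairs = [
    ("I went to the bank to deposit money.", "bank"),
    ("We sat on the bank of the river.", "bank"),
]
# extract_representation returns one vector per (context, word) pair;
# the layer argument selects which hidden layer the vectors come from.
vectors = model.extract_representation(pairs, layer=12)
print(vectors.shape)  # expected: (2, 768) for bert-base-uncased
```

Comparing the two "bank" vectors (e.g., by cosine similarity) is the kind of representational analysis the abstract refers to, while the sentence-score comparison illustrates the behavioral side.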