Title

minicons: Enabling Flexible Behavioral and Representational Analyses of Transformer Language Models

Authors

Misra, Kanishka

Abstract

We present minicons, an open source library that provides a standard API for researchers interested in conducting behavioral and representational analyses of transformer-based language models (LMs). Specifically, minicons enables researchers to apply analysis methods at two levels: (1) at the prediction level -- by providing functions to efficiently extract word/sentence level probabilities; and (2) at the representational level -- by also facilitating efficient extraction of word/phrase level vectors from one or more layers. In this paper, we describe the library and apply it to two motivating case studies: One focusing on the learning dynamics of the BERT architecture on relative grammatical judgments, and the other on benchmarking 23 different LMs on zero-shot abductive reasoning. minicons is available at https://github.com/kanishkamisra/minicons
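The two analysis levels described in the abstract correspond to two modules in the library: scorer for prediction-level probabilities and cwe for representation-level vectors. The following is a minimal sketch based on the examples in the project's README; the checkpoints ("gpt2", "bert-base-uncased"), the example sentences, and the layer index are illustrative choices, and argument names may differ slightly across minicons versions.

```python
# A minimal sketch of minicons' two analysis levels, following the project
# README; model names and layer index here are illustrative assumptions.
from minicons import scorer, cwe

# (1) Prediction level: score sentences with an autoregressive LM.
lm = scorer.IncrementalLMScorer("gpt2", "cpu")
scores = lm.sequence_score(["The keys to the cabinet are on the table."])
print(scores)  # a log-probability-based score per input sentence

# (2) Representational level: contextual word vectors from a masked LM.
# Each instance pairs a sentence with the target word to represent.
emb = cwe.CWE("bert-base-uncased")
vectors = emb.extract_representation(
    [("I went to the bank to deposit money.", "bank")],
    layer=12,  # final layer of bert-base; earlier layers work as well
)
print(vectors.shape)  # one hidden-state vector (768-d for bert-base) per instance
```

The same scorer interface also covers masked LMs (via a MaskedLMScorer class in the same module), which is what the paper's case studies rely on when scoring BERT-style models.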
