Paper Title
minicons: Enabling Flexible Behavioral and Representational Analyses of Transformer Language Models
Paper Authors
Paper Abstract
We present minicons, an open source library that provides a standard API for researchers interested in conducting behavioral and representational analyses of transformer-based language models (LMs). Specifically, minicons enables researchers to apply analysis methods at two levels: (1) at the prediction level, by providing functions to efficiently extract word/sentence level probabilities; and (2) at the representational level, by also facilitating efficient extraction of word/phrase level vectors from one or more layers. In this paper, we describe the library and apply it to two motivating case studies: one focusing on the learning dynamics of the BERT architecture on relative grammatical judgments, and the other on benchmarking 23 different LMs on zero-shot abductive reasoning. minicons is available at https://github.com/kanishkamisra/minicons.
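The two analysis levels described in the abstract correspond to the library's scorer and cwe modules. Below is a minimal sketch of what each looks like in practice, following the usage shown in the project README; the specific model names, method signatures (sequence_score, extract_representation), and their defaults are assumptions that may differ across library versions.

```python
# A minimal sketch of minicons' two analysis levels, based on the README;
# method names and defaults are assumed and may vary between versions.
from minicons import scorer, cwe

# (1) Prediction level: score sentences with an incremental (autoregressive) LM.
ilm = scorer.IncrementalLMScorer("distilgpt2", "cpu")
stimuli = [
    "The keys to the cabinet are on the table.",
    "The keys to the cabinet is on the table.",
]
# sequence_score aggregates per-token log-probabilities into one score
# per sentence, so the grammatical variant should score higher.
print(ilm.sequence_score(stimuli))

# (2) Representational level: extract contextual word vectors.
model = cwe.CWE("bert-base-uncased")
pairs = [
    ("I went to the bank to deposit money.", "bank"),
    ("We sat on the bank of the river.", "bank"),
]
# extract_representation returns one vector per (context, word) pair;
# the layer argument selects which hidden layer the vectors come from.
vectors = model.extract_representation(pairs, layer=12)
print(vectors.shape)  # expected: (2, 768) for bert-base-uncased
```

Comparing the two "bank" vectors (e.g., by cosine similarity) is the kind of representational analysis the abstract refers to, while the sentence-score comparison illustrates the behavioral side.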