Paper Title

Discovering Latent Concepts Learned in BERT

Paper Authors

Fahim Dalvi, Abdul Rafae Khan, Firoj Alam, Nadir Durrani, Jia Xu, Hassan Sajjad

Paper Abstract

A large number of studies that analyze deep neural network models and their ability to encode various linguistic and non-linguistic concepts provide an interpretation of the inner mechanics of these models. The scope of the analyses is limited to pre-defined concepts that reinforce the traditional linguistic knowledge and do not reflect on how novel concepts are learned by the model. We address this limitation by discovering and analyzing latent concepts learned in neural network models in an unsupervised fashion and provide interpretations from the model's perspective. In this work, we study: i) what latent concepts exist in the pre-trained BERT model, ii) how the discovered latent concepts align or diverge from classical linguistic hierarchy and iii) how the latent concepts evolve across layers. Our findings show: i) a model learns novel concepts (e.g. animal categories and demographic groups), which do not strictly adhere to any pre-defined categorization (e.g. POS, semantic tags), ii) several latent concepts are based on multiple properties which may include semantics, syntax, and morphology, iii) the lower layers in the model dominate in learning shallow lexical concepts while the higher layers learn semantic relations and iv) the discovered latent concepts highlight potential biases learned in the model. We also release a novel BERT ConceptNet dataset (BCN) consisting of 174 concept labels and 1M annotated instances.
