Paper Title

Interpreting Embedding Spaces by Conceptualization

Paper Authors

Adi Simhi, Shaul Markovitch

Paper Abstract

One of the main methods for computational interpretation of a text is mapping it into a vector in some embedding space. Such vectors can then be used for a variety of textual processing tasks. Recently, most embedding spaces have been the product of training large language models (LLMs). One major drawback of this type of representation is its incomprehensibility to humans. Understanding the embedding space is crucial for several important needs, including the need to debug the embedding method and compare it to alternatives, and the need to detect biases hidden in the model. In this paper, we present a novel method of understanding embeddings by transforming a latent embedding space into a comprehensible conceptual space. We present an algorithm for deriving a conceptual space with dynamic on-demand granularity. We devise a new evaluation method, using either human raters or LLM-based raters, to show that the conceptualized vectors indeed represent the semantics of the original latent ones. We show the use of our method for various tasks, including comparing the semantics of alternative models and tracing the layers of the LLM. The code is available online at https://github.com/adiSimhi/Interpreting-Embedding-Spaces-by-Conceptualization.
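
For illustration only, the minimal Python sketch below shows the general idea of conceptualization described in the abstract: a text's latent embedding is re-expressed as its similarities to the embeddings of a set of human-readable concepts. This is not the paper's actual algorithm; the `embed` function, the fixed example concept list, and the hash-seeded placeholder vectors are hypothetical stand-ins for a real LLM encoder and for the paper's dynamically derived, on-demand concept space.

```python
import hashlib
import numpy as np

DIM = 384  # dimensionality of the (stand-in) latent embedding space

def embed(text: str) -> np.ndarray:
    """Hypothetical stand-in for an LLM encoder: returns a deterministic
    pseudo-random vector derived from a hash of the text."""
    seed = int(hashlib.md5(text.encode("utf-8")).hexdigest(), 16) % (2**32)
    return np.random.default_rng(seed).normal(size=DIM)

def conceptualize(text: str, concepts: list[str]) -> dict[str, float]:
    """Re-express a text's latent embedding in a human-readable concept space:
    each coordinate is the cosine similarity between the text's embedding and
    the embedding of one named concept."""
    v = embed(text)
    v = v / np.linalg.norm(v)
    scores = {}
    for concept in concepts:
        u = embed(concept)
        scores[concept] = float(v @ (u / np.linalg.norm(u)))
    return scores

# Example usage with a hypothetical, fixed concept list (the paper instead
# derives its concept space with dynamic, on-demand granularity).
concepts = ["sports", "politics", "medicine", "finance"]
print(conceptualize("The team won the championship final.", concepts))
```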
