Title

Supervised Understanding of Word Embeddings

Authors

Halid Ziya Yerebakan, Parmeet Bhatia, Yoshihisa Shinagawa

Abstract

Pre-trained word embeddings are widely used for transfer learning in natural language processing. The embeddings are continuous and distributed representations of words that preserve their similarities in compact Euclidean spaces. However, the dimensions of these spaces do not provide any clear interpretation. In this study, we have obtained supervised projections in the form of linear keyword-level classifiers on word embeddings. We have shown that the method creates interpretable projections of the original embedding dimensions. Activations of the trained classifier nodes correspond to a subset of the words in the vocabulary. Thus, they behave similarly to dictionary features while having the merit of continuous-valued output. Additionally, such dictionaries can be grown iteratively over multiple rounds by adding expert labels on top-scoring words to an initial collection of keywords. Also, the same classifiers can be applied to aligned word embeddings in other languages to obtain corresponding dictionaries. In our experiments, we have shown that initializing higher-order networks with these classifier weights gives more accurate models for downstream NLP tasks. We further demonstrate the usefulness of supervised dimensions in revealing the polysemous nature of a keyword of interest by projecting its embedding using classifiers learned in different sub-spaces.
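The core idea of the abstract can be sketched in code: train a linear (logistic) classifier that treats a small seed set of keywords as positives and the rest of the vocabulary as negatives, then use the classifier's activation as a supervised dimension and inspect top-scoring non-seed words as candidates for the next round of expert labeling. The vocabulary, embeddings, and seed set below are hypothetical toy data, not the paper's actual setup.

```python
import numpy as np

rng = np.random.default_rng(0)
dim = 8
vocab = ["heart", "lung", "kidney", "car", "road", "bridge"]
# Toy embeddings: anatomy words share one cluster, the rest another.
centers = {w: (np.ones(dim) if w in {"heart", "lung", "kidney"}
               else -np.ones(dim)) for w in vocab}
emb = {w: centers[w] + 0.1 * rng.standard_normal(dim) for w in vocab}

def train_keyword_classifier(positives, vocab, emb, epochs=200, lr=0.5):
    """Logistic regression: seed keywords as positives, rest as negatives."""
    X = np.stack([emb[w] for w in vocab])
    y = np.array([1.0 if w in positives else 0.0 for w in vocab])
    w, b = np.zeros(X.shape[1]), 0.0
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))  # sigmoid activations
        g = p - y                               # logistic-loss gradient
        w -= lr * (X.T @ g) / len(y)
        b -= lr * g.mean()
    return w, b

seeds = {"heart", "lung"}
w, b = train_keyword_classifier(seeds, vocab, emb)
# The classifier activation acts as a supervised dimension: score every
# vocabulary word, then review top-scoring non-seed words as dictionary
# candidates for the next labeling round.
scores = {word: float(emb[word] @ w + b) for word in vocab}
ranked = sorted(scores, key=scores.get, reverse=True)
candidates = [word for word in ranked if word not in seeds]
print(candidates[0])  # a semantically close word should surface first
```

Applying the same weight vector to aligned embeddings in another language would, under the abstract's cross-lingual claim, score that language's vocabulary and yield a corresponding dictionary without retraining.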
