Paper Title
Multiple Word Embeddings for Increased Diversity of Representation
Paper Authors
Paper Abstract
Most state-of-the-art models in natural language processing (NLP) are neural models built on top of large, pre-trained, contextual language models that generate representations of words in context and are fine-tuned for the task at hand. The improvements afforded by these "contextual embeddings" come with a high computational cost. In this work, we explore a simple technique that substantially and consistently improves performance over a strong baseline with negligible increase in run time. We concatenate multiple pre-trained embeddings to strengthen our representation of words. We show that this concatenation technique works across many tasks, datasets, and model types. We analyze aspects of pre-trained embedding similarity and vocabulary coverage and find that the representational diversity between different pre-trained embeddings is the driving force of why this technique works. We provide open source implementations of our models in both TensorFlow and PyTorch.
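The concatenation technique itself is simple: look up each word in several pre-trained embedding tables and join the vectors along the feature dimension. Below is a minimal PyTorch sketch of that idea; the class name `ConcatEmbeddings` and its interface are illustrative assumptions and do not reflect the structure of the paper's released TensorFlow/PyTorch code.

```python
import torch
import torch.nn as nn


class ConcatEmbeddings(nn.Module):
    """Concatenate several pre-trained embedding tables along the feature axis.

    Hypothetical illustration of the concatenation technique: each table is a
    (vocab_size, dim_i) matrix from a different pre-trained embedding
    (e.g. GloVe, word2vec), indexed by a shared vocabulary.
    """

    def __init__(self, pretrained_matrices, freeze=True):
        super().__init__()
        self.tables = nn.ModuleList(
            nn.Embedding.from_pretrained(m, freeze=freeze)
            for m in pretrained_matrices
        )
        # Output dimension is the sum of the individual embedding dimensions.
        self.output_dim = sum(m.shape[1] for m in pretrained_matrices)

    def forward(self, token_ids):
        # token_ids: (batch, seq_len) indices into the shared vocabulary.
        # Returns: (batch, seq_len, output_dim).
        return torch.cat([table(token_ids) for table in self.tables], dim=-1)


if __name__ == "__main__":
    vocab_size = 10
    # Toy stand-ins for two pre-trained matrices of different widths.
    glove_like = torch.randn(vocab_size, 50)
    w2v_like = torch.randn(vocab_size, 300)

    emb = ConcatEmbeddings([glove_like, w2v_like])
    ids = torch.tensor([[1, 2, 3]])
    print(emb(ids).shape)  # torch.Size([1, 3, 350])
```

Because the extra cost is only additional table lookups and a concatenation, the run-time overhead relative to the downstream model is negligible, which is consistent with the abstract's claim.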