Paper Title

CokeBERT: Contextual Knowledge Selection and Embedding towards Enhanced Pre-Trained Language Models

Authors

Yusheng Su, Xu Han, Zhengyan Zhang, Peng Li, Zhiyuan Liu, Yankai Lin, Jie Zhou, Maosong Sun

Abstract

Several recent efforts have been devoted to enhancing pre-trained language models (PLMs) by utilizing extra heterogeneous knowledge from knowledge graphs (KGs), and have achieved consistent improvements on various knowledge-driven NLP tasks. However, most of these knowledge-enhanced PLMs embed static sub-graphs of KGs ("knowledge context"), ignoring that the knowledge required by PLMs may change dynamically according to the specific text ("textual context"). In this paper, we propose a novel framework named Coke that dynamically selects contextual knowledge and embeds the knowledge context according to the textual context for PLMs, which avoids the effect of redundant and ambiguous knowledge in KGs that does not match the input text. Our experimental results show that Coke outperforms various baselines on typical knowledge-driven NLP tasks, indicating the effectiveness of utilizing dynamic knowledge context for language understanding. Beyond the performance improvements, the knowledge dynamically selected by Coke describes the semantics of text-related knowledge in a more interpretable form than conventional PLMs. Our source code and datasets will be made available to provide more details for Coke.
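To make the idea of "dynamic knowledge context" concrete, the following is a minimal, self-contained sketch (assuming NumPy) of text-conditioned knowledge selection: given a vector representing the textual context and pre-computed embeddings of candidate KG triples around a mentioned entity, keep only the top-k triples most relevant to the text. The function name `select_knowledge_context`, the toy triples, and the cosine-similarity scoring are illustrative assumptions for this sketch only; the actual framework learns a text-conditioned scoring over the KG neighborhood jointly with the PLM rather than using fixed cosine similarity.

```python
# Conceptual sketch only, NOT the authors' implementation: select a dynamic
# "knowledge context" (top-k KG triples) conditioned on the textual context.
import numpy as np

def select_knowledge_context(text_vec, triple_embs, triples, k=3):
    """Score each candidate triple against the textual-context vector and
    return the k most relevant triples (the dynamic knowledge context)."""
    # Relevance here is plain cosine similarity; the real framework uses a
    # learned, text-conditioned scoring function instead.
    text_norm = text_vec / (np.linalg.norm(text_vec) + 1e-8)
    trip_norm = triple_embs / (np.linalg.norm(triple_embs, axis=1, keepdims=True) + 1e-8)
    scores = trip_norm @ text_norm
    top_idx = np.argsort(-scores)[:k]
    return [(triples[i], float(scores[i])) for i in top_idx]

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    dim = 8
    # Toy candidate triples from an entity's KG neighborhood; the ambiguous
    # "soft drink" senses illustrate knowledge that should be filtered out
    # when the input text is about language models.
    triples = [
        ("Coke", "instance_of", "software_framework"),
        ("Coke", "related_to", "pre-trained_language_model"),
        ("Coke", "instance_of", "soft_drink"),
        ("Coke", "manufacturer", "The_Coca-Cola_Company"),
    ]
    triple_embs = rng.normal(size=(len(triples), dim))
    # Pretend the textual context is about PLMs, so it lies near triple 1.
    text_vec = triple_embs[1] + 0.1 * rng.normal(size=dim)
    for triple, score in select_knowledge_context(text_vec, triple_embs, triples, k=2):
        print(f"{score:+.3f}  {triple}")
```

Run as a script, this prints the two triples whose embeddings best match the textual-context vector, mimicking how a dynamically selected knowledge context keeps text-relevant facts and drops redundant or ambiguous ones.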
