Paper Title

Decoupling Knowledge from Memorization: Retrieval-augmented Prompt Learning

Paper Authors

Xiang Chen, Lei Li, Ningyu Zhang, Xiaozhuan Liang, Shumin Deng, Chuanqi Tan, Fei Huang, Luo Si, Huajun Chen

Paper Abstract

Prompt learning approaches have made waves in natural language processing by inducing better few-shot performance while still following a parametric-based learning paradigm; the oblivion and rote memorization problems of learning may encounter unstable generalization issues. Specifically, vanilla prompt learning may struggle to utilize atypical instances by rote during fully-supervised training or overfit shallow patterns with low-shot data. To alleviate such limitations, we develop RetroPrompt with the motivation of decoupling knowledge from memorization to help the model strike a balance between generalization and memorization. In contrast with vanilla prompt learning, RetroPrompt constructs an open-book knowledge-store from training instances and implements a retrieval mechanism during the process of input, training and inference, thus equipping the model with the ability to retrieve related contexts from the training corpus as cues for enhancement. Extensive experiments demonstrate that RetroPrompt can obtain better performance in both few-shot and zero-shot settings. Besides, we further illustrate that our proposed RetroPrompt can yield better generalization abilities with new datasets. Detailed analysis of memorization indeed reveals that RetroPrompt can reduce the reliance of language models on memorization, thus improving generalization for downstream tasks. Code is available at https://github.com/zjunlp/PromptKG/tree/main/research/RetroPrompt.
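
To make the retrieval idea in the abstract concrete, the sketch below builds an open-book knowledge store by embedding labeled training instances, retrieves the nearest stored instance for a query, and prepends it to the query as a demonstration cue in a cloze-style prompt. This is a minimal illustration, not the authors' implementation: the encoder choice (bert-base-uncased with mean pooling), the encode/retrieve helpers, the toy data, and k=1 are all assumptions made for the example.

```python
import numpy as np
import torch
from transformers import AutoTokenizer, AutoModel

# Illustrative encoder choice; RetroPrompt derives representations from the prompt-based LM itself.
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
encoder = AutoModel.from_pretrained("bert-base-uncased")

def encode(texts):
    """Mean-pooled, L2-normalized hidden states as sentence embeddings (illustrative choice)."""
    batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        hidden = encoder(**batch).last_hidden_state          # (B, T, H)
    mask = batch["attention_mask"].unsqueeze(-1)             # (B, T, 1)
    emb = (hidden * mask).sum(1) / mask.sum(1)               # masked mean pooling
    return torch.nn.functional.normalize(emb, dim=-1).numpy()

# 1) Build an open-book knowledge store from labeled training instances (toy data).
train_texts  = ["the movie was a delight", "a tedious, lifeless plot"]
train_labels = ["positive", "negative"]
knowledge_store = encode(train_texts)                        # (N, H) keys; values = (text, label)

# 2) At input/inference time, retrieve the k nearest training contexts as cues.
def retrieve(query, k=1):
    q = encode([query])                                      # (1, H)
    scores = knowledge_store @ q.T                           # cosine similarity on normalized vectors
    top = np.argsort(-scores[:, 0])[:k]
    return [(train_texts[i], train_labels[i]) for i in top]

# 3) Concatenate retrieved cues with the query to form a retrieval-augmented prompt.
query = "an utterly boring film"
cues = " ".join(f"{t} It was {l}." for t, l in retrieve(query, k=1))
prompt = f"{cues} {query} It was [MASK]."
print(prompt)
```

In the paper, retrieval is also used during training and to guide the masked-token prediction itself; the snippet only shows the input-side augmentation step.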
