Paper Title

Input-Tuning: Adapting Unfamiliar Inputs to Frozen Pretrained Models

Authors

Shengnan An, Yifei Li, Zeqi Lin, Qian Liu, Bei Chen, Qiang Fu, Weizhu Chen, Nanning Zheng, Jian-Guang Lou

Abstract

Recently the prompt-tuning paradigm has attracted significant attention. By only tuning continuous prompts with a frozen pre-trained language model (PLM), prompt-tuning takes a step towards deploying a shared frozen PLM to serve numerous downstream tasks. Although prompt-tuning shows good performance on certain natural language understanding (NLU) tasks, its effectiveness on natural language generation (NLG) tasks is still under-explored. In this paper, we argue that one of the factors hindering the development of prompt-tuning on NLG tasks is the unfamiliar inputs (i.e., inputs are linguistically different from the pretraining corpus). For example, our preliminary exploration reveals a large performance gap between prompt-tuning and fine-tuning when unfamiliar inputs occur frequently in NLG tasks. This motivates us to propose input-tuning, which fine-tunes both the continuous prompts and the input representations, leading to a more effective way to adapt unfamiliar inputs to frozen PLMs. Our proposed input-tuning is conceptually simple and empirically powerful. Experimental results on seven NLG tasks demonstrate that input-tuning is significantly and consistently better than prompt-tuning. Furthermore, on three of these tasks, input-tuning can achieve a comparable or even better performance than fine-tuning.
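The abstract only states the idea at a high level: tune the continuous prompts and the input representations while the PLM itself stays frozen. Purely as an illustration, the sketch below shows one way this could look with a Hugging Face GPT-2 decoder: a trainable continuous prompt prepended at the embedding layer plus a small trainable adapter over the input embeddings, with every PLM parameter frozen. The class name InputTunedGPT2, the adapter shape, and all hyperparameters are assumptions for illustration, not the authors' actual implementation.

```python
# Illustrative sketch of input-tuning (assumed design, not the paper's code):
# the PLM is frozen; only the continuous prompt and the input adapter are trained.
import torch
import torch.nn as nn
from transformers import GPT2LMHeadModel


class InputTunedGPT2(nn.Module):
    def __init__(self, model_name="gpt2", prompt_len=20, adapter_dim=256):
        super().__init__()
        self.plm = GPT2LMHeadModel.from_pretrained(model_name)
        for p in self.plm.parameters():
            p.requires_grad = False                        # keep the PLM frozen
        hidden = self.plm.config.n_embd
        # Continuous prompt, as in prompt-tuning.
        self.prompt = nn.Parameter(torch.randn(prompt_len, hidden) * 0.02)
        # Lightweight adapter that re-maps (unfamiliar) input embeddings.
        self.adapter = nn.Sequential(
            nn.Linear(hidden, adapter_dim),
            nn.Tanh(),
            nn.Linear(adapter_dim, hidden),
        )

    def forward(self, input_ids, attention_mask, labels=None):
        embeds = self.plm.transformer.wte(input_ids)       # frozen word embeddings
        embeds = embeds + self.adapter(embeds)             # tuned input representations
        bsz, p_len = input_ids.size(0), self.prompt.size(0)
        prompt = self.prompt.unsqueeze(0).expand(bsz, -1, -1)
        embeds = torch.cat([prompt, embeds], dim=1)        # prepend continuous prompt
        prompt_mask = attention_mask.new_ones(bsz, p_len)
        attention_mask = torch.cat([prompt_mask, attention_mask], dim=1)
        if labels is not None:
            # -100 masks the prompt positions out of the language-modeling loss.
            pad = labels.new_full((bsz, p_len), -100)
            labels = torch.cat([pad, labels], dim=1)
        return self.plm(inputs_embeds=embeds,
                        attention_mask=attention_mask,
                        labels=labels)


# Only the prompt and adapter parameters receive gradients.
model = InputTunedGPT2()
trainable = [p for p in model.parameters() if p.requires_grad]
optimizer = torch.optim.AdamW(trainable, lr=5e-4)
```

Because the optimizer is built only from the trainable subset, the frozen PLM can in principle be shared across many downstream tasks, which is the deployment benefit the abstract emphasizes.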
