Paper Title

Input-Tuning: Adapting Unfamiliar Inputs to Frozen Pretrained Models

Authors

Shengnan An, Yifei Li, Zeqi Lin, Qian Liu, Bei Chen, Qiang Fu, Weizhu Chen, Nanning Zheng, Jian-Guang Lou

Abstract

Recently the prompt-tuning paradigm has attracted significant attention. By only tuning continuous prompts with a frozen pre-trained language model (PLM), prompt-tuning takes a step towards deploying a shared frozen PLM to serve numerous downstream tasks. Although prompt-tuning shows good performance on certain natural language understanding (NLU) tasks, its effectiveness on natural language generation (NLG) tasks is still under-explored. In this paper, we argue that one of the factors hindering the development of prompt-tuning on NLG tasks is the unfamiliar inputs (i.e., inputs are linguistically different from the pretraining corpus). For example, our preliminary exploration reveals a large performance gap between prompt-tuning and fine-tuning when unfamiliar inputs occur frequently in NLG tasks. This motivates us to propose input-tuning, which fine-tunes both the continuous prompts and the input representations, leading to a more effective way to adapt unfamiliar inputs to frozen PLMs. Our proposed input-tuning is conceptually simple and empirically powerful. Experimental results on seven NLG tasks demonstrate that input-tuning is significantly and consistently better than prompt-tuning. Furthermore, on three of these tasks, input-tuning can achieve a comparable or even better performance than fine-tuning.
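The abstract only states the idea at a high level: tune the continuous prompts and the input representations while the PLM itself stays frozen. Purely as an illustration, the sketch below shows one way this could look with a Hugging Face GPT-2 decoder: a trainable continuous prompt prepended at the embedding layer plus a small trainable adapter over the input embeddings, with every PLM parameter frozen. The class name InputTunedGPT2, the adapter shape, and all hyperparameters are assumptions for illustration, not the authors' actual implementation.

```python
# Illustrative sketch of input-tuning (assumed design, not the paper's code):
# the PLM is frozen; only the continuous prompt and the input adapter are trained.
import torch
import torch.nn as nn
from transformers import GPT2LMHeadModel


class InputTunedGPT2(nn.Module):
    def __init__(self, model_name="gpt2", prompt_len=20, adapter_dim=256):
        super().__init__()
        self.plm = GPT2LMHeadModel.from_pretrained(model_name)
        for p in self.plm.parameters():
            p.requires_grad = False                        # keep the PLM frozen
        hidden = self.plm.config.n_embd
        # Continuous prompt, as in prompt-tuning.
        self.prompt = nn.Parameter(torch.randn(prompt_len, hidden) * 0.02)
        # Lightweight adapter that re-maps (unfamiliar) input embeddings.
        self.adapter = nn.Sequential(
            nn.Linear(hidden, adapter_dim),
            nn.Tanh(),
            nn.Linear(adapter_dim, hidden),
        )

    def forward(self, input_ids, attention_mask, labels=None):
        embeds = self.plm.transformer.wte(input_ids)       # frozen word embeddings
        embeds = embeds + self.adapter(embeds)             # tuned input representations
        bsz, p_len = input_ids.size(0), self.prompt.size(0)
        prompt = self.prompt.unsqueeze(0).expand(bsz, -1, -1)
        embeds = torch.cat([prompt, embeds], dim=1)        # prepend continuous prompt
        prompt_mask = attention_mask.new_ones(bsz, p_len)
        attention_mask = torch.cat([prompt_mask, attention_mask], dim=1)
        if labels is not None:
            # -100 masks the prompt positions out of the language-modeling loss.
            pad = labels.new_full((bsz, p_len), -100)
            labels = torch.cat([pad, labels], dim=1)
        return self.plm(inputs_embeds=embeds,
                        attention_mask=attention_mask,
                        labels=labels)


# Only the prompt and adapter parameters receive gradients.
model = InputTunedGPT2()
trainable = [p for p in model.parameters() if p.requires_grad]
optimizer = torch.optim.AdamW(trainable, lr=5e-4)
```

Because the optimizer is built only from the trainable subset, the frozen PLM can in principle be shared across many downstream tasks, which is the deployment benefit the abstract emphasizes.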
