Paper Title
HyperPELT: Unified Parameter-Efficient Language Model Tuning for Both Language and Vision-and-Language Tasks
Paper Authors
Paper Abstract
The workflow of pretraining and fine-tuning has emerged as a popular paradigm for solving various NLP and V&L (Vision-and-Language) downstream tasks. With the capacity of pretrained models growing rapidly, how to perform parameter-efficient fine-tuning has become increasingly important for quick transfer learning and deployment. In this paper, we design a novel unified parameter-efficient transfer learning framework that works effectively on both pure language and V&L tasks. In particular, we use a shared hypernetwork that takes trainable hyper-embeddings as input, and outputs weights for fine-tuning different small modules in a pretrained language model, such as tuning the parameters inserted into multi-head attention blocks (i.e., prefix-tuning) and feed-forward blocks (i.e., adapter-tuning). We define a set of embeddings (e.g., layer, block, task and visual embeddings) as the key components to calculate hyper-embeddings, which thus can support both pure language and V&L tasks. Our proposed framework adds fewer trainable parameters in multi-task learning while achieving superior performance and transferability compared to state-of-the-art methods. Empirical results on the GLUE benchmark and multiple V&L tasks confirm the effectiveness of our framework on both textual and visual modalities.
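
To make the mechanism concrete, below is a minimal PyTorch sketch of the idea described in the abstract: a shared hypernetwork consumes a hyper-embedding built from trainable task, layer, and block embeddings (plus a visual embedding for V&L tasks) and emits adapter weights for feed-forward blocks and key/value prefixes for attention blocks. All class names, dimensions, and the choice of summation for combining embeddings are illustrative assumptions, not the paper's exact configuration.

```python
import torch
import torch.nn as nn

class SharedHyperNetwork(nn.Module):
    """Sketch: one hypernetwork generates weights for adapter-tuning
    (feed-forward blocks) and prefix-tuning (attention blocks).
    Dimensions and names are illustrative assumptions."""

    def __init__(self, embed_dim=64, hidden_dim=768,
                 bottleneck=24, prefix_len=16):
        super().__init__()
        self.hidden_dim = hidden_dim
        self.bottleneck = bottleneck
        self.prefix_len = prefix_len
        # One linear generator per target parameter group.
        self.gen_adapter_down = nn.Linear(embed_dim, bottleneck * hidden_dim)
        self.gen_adapter_up = nn.Linear(embed_dim, hidden_dim * bottleneck)
        self.gen_prefix_kv = nn.Linear(embed_dim, 2 * prefix_len * hidden_dim)

    def forward(self, task_emb, layer_emb, block_emb, visual_emb=None):
        # Hyper-embedding: combine the trainable source embeddings.
        # Summation is one simple choice; a V&L task would also fold in
        # a visual embedding here.
        h = task_emb + layer_emb + block_emb
        if visual_emb is not None:
            h = h + visual_emb
        w_down = self.gen_adapter_down(h).view(self.bottleneck, self.hidden_dim)
        w_up = self.gen_adapter_up(h).view(self.hidden_dim, self.bottleneck)
        prefix_k, prefix_v = self.gen_prefix_kv(h).view(
            2, self.prefix_len, self.hidden_dim)
        return w_down, w_up, prefix_k, prefix_v

# Usage: only the hypernetwork and the small source embeddings are trained;
# the pretrained backbone stays frozen.
embed_dim = 64
hyper = SharedHyperNetwork(embed_dim=embed_dim)
task_emb = nn.Parameter(torch.randn(embed_dim))
layer_emb = nn.Parameter(torch.randn(embed_dim))
block_emb = nn.Parameter(torch.randn(embed_dim))
w_down, w_up, k, v = hyper(task_emb, layer_emb, block_emb)
# w_down/w_up parameterize a bottleneck adapter in a feed-forward block;
# k/v are prepended to the attention keys/values (prefix-tuning).
```

Because the generators are shared across layers, blocks, and tasks, adding a task costs only one new task embedding rather than a full set of per-layer modules, which is one plausible reading of why the framework adds fewer trainable parameters in multi-task learning.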