Paper Title

Zero-shot Cross-lingual Transfer of Prompt-based Tuning with a Unified Multilingual Prompt

Paper Authors

Lianzhe Huang, Shuming Ma, Dongdong Zhang, Furu Wei, Houfeng Wang

Paper Abstract

Prompt-based tuning has been proven effective for pretrained language models (PLMs). While most existing work focuses on monolingual prompts, we study multilingual prompts for multilingual PLMs, especially in the zero-shot cross-lingual setting. To alleviate the effort of designing different prompts for multiple languages, we propose a novel model that uses a unified prompt for all languages, called UniPrompt. Unlike discrete prompts and soft prompts, the unified prompt is model-based and language-agnostic. Specifically, the unified prompt is initialized by a multilingual PLM to produce a language-independent representation, after which it is fused with the text input. During inference, the prompts can be pre-computed so that no extra computation cost is needed. To work together with the unified prompt, we propose a new initialization method for the target label words to further improve the model's transferability across languages. Extensive experiments show that our proposed methods can significantly outperform strong baselines across different languages. We release data and code to facilitate future research.
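The following is a minimal sketch, not the authors' implementation, of the core idea in the abstract: a multilingual PLM encodes the prompt once into language-agnostic hidden states, which are cached and then fused with each input text; here fusion is approximated by prepending the cached states to the input token embeddings. The model name (xlm-roberta-base), the prompt text, and the prepend-style fusion are illustrative assumptions, and the label-word initialization method is not covered.

import torch
from transformers import AutoTokenizer, AutoModel

# Hypothetical sketch of the unified-prompt idea: the prompt is encoded once by a
# multilingual PLM (model-based, language-agnostic) and its hidden states are reused
# for every input, so no extra prompt computation is needed at inference time.
MODEL_NAME = "xlm-roberta-base"  # assumption: any multilingual PLM would do
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
encoder = AutoModel.from_pretrained(MODEL_NAME)
encoder.eval()

@torch.no_grad()
def precompute_prompt(prompt_text: str) -> torch.Tensor:
    # Encode the prompt once; the resulting hidden states are cached and reused.
    ids = tokenizer(prompt_text, return_tensors="pt")
    return encoder(**ids).last_hidden_state  # shape: (1, prompt_len, hidden)

@torch.no_grad()
def encode_with_prompt(prompt_states: torch.Tensor, text: str) -> torch.Tensor:
    # Fuse the cached prompt states with the input by prepending them to the
    # input token embeddings, then run the encoder on the fused embeddings.
    ids = tokenizer(text, return_tensors="pt")
    input_embeds = encoder.get_input_embeddings()(ids["input_ids"])
    fused = torch.cat([prompt_states, input_embeds], dim=1)
    attn = torch.ones(fused.shape[:2], dtype=torch.long)
    out = encoder(inputs_embeds=fused, attention_mask=attn)
    return out.last_hidden_state[:, 0]  # pooled representation for classification

prompt_states = precompute_prompt("Topic:")  # computed once, shared by all languages
rep_en = encode_with_prompt(prompt_states, "The match ended in a draw.")
rep_zh = encode_with_prompt(prompt_states, "比赛以平局结束。")
print(rep_en.shape, rep_zh.shape)  # both (1, hidden_size)

Because the same cached prompt states are shared across languages, the sketch illustrates how a single unified prompt can serve zero-shot cross-lingual transfer without per-language prompt design.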
