Paper Title
Task-guided Disentangled Tuning for Pretrained Language Models
Paper Authors
Paper Abstract
Pretrained language models (PLMs) trained on large-scale unlabeled corpora are typically fine-tuned on task-specific downstream datasets, producing state-of-the-art results on various NLP tasks. However, the discrepancy between pretraining and downstream data in domain and scale prevents fine-tuning from effectively capturing task-specific patterns, especially in the low-data regime. To address this issue, we propose Task-guided Disentangled Tuning (TDT) for PLMs, which enhances the generalization of representations by disentangling task-relevant signals from entangled representations. For a given task, we introduce a learnable confidence model to detect indicative guidance from context, and further propose a disentangled regularization to mitigate the over-reliance problem. Experimental results on the GLUE and CLUE benchmarks show that TDT consistently outperforms fine-tuning across different PLMs, and extensive analysis demonstrates the effectiveness and robustness of our method. Code is available at https://github.com/lemon0830/TDT.
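To make the two components named in the abstract concrete, below is a minimal, hypothetical PyTorch sketch of a token-level confidence model plus a disentangled regularizer. The names (ConfidenceModel, disentangled_views, tdt_loss), the mean-pooled views, and the uniform-prediction form of the regularizer are illustrative assumptions, not the paper's exact formulation; see the repository linked above for the actual implementation.

# Hypothetical sketch in the spirit of TDT: a small MLP scores each token's
# task relevance, and a regularizer discourages over-reliance on a few cues.
# All names and the exact loss form are assumptions for illustration.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ConfidenceModel(nn.Module):
    """Scores each token's task relevance from the encoder's hidden states."""
    def __init__(self, hidden_size: int):
        super().__init__()
        self.scorer = nn.Sequential(
            nn.Linear(hidden_size, hidden_size),
            nn.Tanh(),
            nn.Linear(hidden_size, 1),
        )

    def forward(self, hidden_states, attention_mask):
        # hidden_states: (batch, seq_len, hidden); attention_mask: (batch, seq_len)
        logits = self.scorer(hidden_states).squeeze(-1)
        logits = logits.masked_fill(attention_mask == 0, -1e4)  # ignore padding
        return torch.sigmoid(logits)  # per-token confidence in (0, 1)

def disentangled_views(hidden_states, confidence):
    # Task-guided view keeps high-confidence tokens; the complementary view
    # keeps the rest. Mean pooling is a simplification of the paper's setup.
    guided = (hidden_states * confidence.unsqueeze(-1)).mean(dim=1)
    complement = (hidden_states * (1.0 - confidence).unsqueeze(-1)).mean(dim=1)
    return guided, complement

def tdt_loss(classifier, hidden_states, confidence, labels, alpha=0.1):
    guided, complement = disentangled_views(hidden_states, confidence)
    task_loss = F.cross_entropy(classifier(guided), labels)
    # Assumed regularizer: push predictions from the complementary view toward
    # uniform, so the task signal must come from the confident tokens.
    comp_logits = classifier(complement)
    uniform = torch.full_like(comp_logits, 1.0 / comp_logits.size(-1))
    reg = F.kl_div(F.log_softmax(comp_logits, dim=-1), uniform,
                   reduction="batchmean")
    return task_loss + alpha * reg

In this sketch the confidence model and classifier would be trained jointly with the PLM encoder, so the hyperparameter alpha trades off task accuracy against how strongly the complementary view is penalized.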