Paper Title
Hierarchical Inductive Transfer for Continual Dialogue Learning
Paper Authors
Paper Abstract
Pre-trained models have achieved excellent performance on dialogue tasks. However, with the continual increase of online chit-chat scenarios, directly fine-tuning these models for each new task not only explodes the capacity of the dialogue system on embedded devices but also causes knowledge forgetting in the pre-trained models and knowledge interference among diverse dialogue tasks. In this work, we propose a hierarchical inductive transfer framework to learn and deploy dialogue skills continually and efficiently. First, we introduce the adapter module into pre-trained models for learning new dialogue tasks. As the only trainable module, it enables the dialogue system on embedded devices to acquire new dialogue skills with negligible additional parameters. Then, to alleviate knowledge interference between tasks while still benefiting from the regularization among them, we further design hierarchical inductive transfer, which enables new tasks to use the general knowledge in the base adapter without being misled by the diverse knowledge in task-specific adapters. Empirical evaluation and analysis indicate that our framework obtains comparable performance under deployment-friendly model capacity.
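
To make the described architecture concrete, the following is a minimal PyTorch sketch of the adapter and hierarchical-transfer idea, assuming a standard bottleneck adapter (down-projection, nonlinearity, up-projection, residual connection). All names here (Adapter, HierarchicalAdapterLayer, bottleneck_dim, etc.) are illustrative assumptions, not the authors' implementation.

import torch
import torch.nn as nn

class Adapter(nn.Module):
    """Bottleneck adapter: down-project, nonlinearity, up-project, residual."""
    def __init__(self, hidden_dim: int, bottleneck_dim: int):
        super().__init__()
        self.down = nn.Linear(hidden_dim, bottleneck_dim)
        self.act = nn.ReLU()
        self.up = nn.Linear(bottleneck_dim, hidden_dim)

    def forward(self, h: torch.Tensor) -> torch.Tensor:
        return h + self.up(self.act(self.down(h)))

class HierarchicalAdapterLayer(nn.Module):
    """A frozen backbone sub-layer followed by a shared base adapter
    (general, cross-task knowledge) and per-task adapters (task-specific
    knowledge). A new task trains only its own adapter, so it can reuse
    the general knowledge without interfering with other tasks."""
    def __init__(self, backbone_layer: nn.Module, hidden_dim: int,
                 bottleneck_dim: int, num_tasks: int):
        super().__init__()
        self.backbone = backbone_layer
        for p in self.backbone.parameters():
            p.requires_grad = False  # pre-trained weights stay fixed

        self.base_adapter = Adapter(hidden_dim, bottleneck_dim)
        self.task_adapters = nn.ModuleList(
            [Adapter(hidden_dim, bottleneck_dim) for _ in range(num_tasks)]
        )

    def forward(self, h: torch.Tensor, task_id: int) -> torch.Tensor:
        h = self.backbone(h)
        h = self.base_adapter(h)                # shared general knowledge
        return self.task_adapters[task_id](h)   # task-specific refinement

# Toy usage: a Linear layer stands in for a Transformer sub-layer.
layer = HierarchicalAdapterLayer(nn.Linear(768, 768), hidden_dim=768,
                                 bottleneck_dim=64, num_tasks=3)
out = layer(torch.randn(2, 16, 768), task_id=1)  # (batch, seq_len, hidden)

Under this reading, when a new chit-chat scenario arrives, a fresh task adapter is appended and trained while the backbone and base adapter stay frozen, so previously learned tasks are untouched and the per-task parameter cost remains negligible.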