Paper Title

Understanding and Improving Information Transfer in Multi-Task Learning

Authors

Sen Wu, Hongyang R. Zhang, Christopher Ré

Abstract

We investigate multi-task learning approaches that use a shared feature representation for all tasks. To better understand the transfer of task information, we study an architecture with a shared module for all tasks and a separate output module for each task. We study the theory of this setting on linear and ReLU-activated models. Our key observation is that whether or not tasks' data are well-aligned can significantly affect the performance of multi-task learning. We show that misalignment between task data can cause negative transfer (or hurt performance) and provide sufficient conditions for positive transfer. Inspired by the theoretical insights, we show that aligning tasks' embedding layers leads to performance gains for multi-task training and transfer learning on the GLUE benchmark and sentiment analysis tasks; for example, we obtain a 2.35% GLUE score average improvement on 5 GLUE tasks over BERT-LARGE using our alignment method. We also design an SVD-based task reweighting scheme and show that it improves the robustness of multi-task training on a multi-label image dataset.
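To make the setting concrete, below is a minimal sketch of the architecture the abstract describes: one shared representation module feeding a separate output module (head) per task. This is an illustrative PyTorch reconstruction, not the authors' code; the layer sizes, task count, and class name are assumptions.

```python
import torch
import torch.nn as nn

class SharedModuleMTL(nn.Module):
    """Hard parameter sharing: one shared module, one output head per task.
    A sketch of the setting in the abstract, not the authors' implementation."""

    def __init__(self, input_dim, hidden_dim, task_output_dims):
        super().__init__()
        # Shared module: a single ReLU-activated layer, echoing the
        # "linear and ReLU-activated models" the paper analyzes.
        self.shared = nn.Sequential(nn.Linear(input_dim, hidden_dim), nn.ReLU())
        # A separate linear output module for each task.
        self.heads = nn.ModuleList(
            [nn.Linear(hidden_dim, d) for d in task_output_dims]
        )

    def forward(self, x, task_id):
        return self.heads[task_id](self.shared(x))

# Usage: two hypothetical tasks over the same 64-dim feature space.
model = SharedModuleMTL(input_dim=64, hidden_dim=32, task_output_dims=[1, 3])
x = torch.randn(8, 64)
y0 = model(x, task_id=0)  # shape (8, 1)
y1 = model(x, task_id=1)  # shape (8, 3)
```

The abstract's central quantity is whether tasks' data are well-aligned. One way to make that intuition computable, offered purely for illustration and not as the paper's exact SVD-based scheme, is to compare the top singular directions of two tasks' data matrices via principal angles:

```python
import torch

def subspace_alignment(X1, X2, k=5):
    """Mean cosine of principal angles between the top-k feature subspaces
    of two data matrices of shape (n_samples, d). Assumes k <= min(n, d)."""
    # Top-k principal feature directions = leading right singular vectors.
    V1 = torch.linalg.svd(X1, full_matrices=False).Vh[:k]  # (k, d)
    V2 = torch.linalg.svd(X2, full_matrices=False).Vh[:k]  # (k, d)
    # Singular values of V1 @ V2.T are cosines of the principal angles.
    cosines = torch.linalg.svdvals(V1 @ V2.T)  # values in [0, 1]
    return cosines.mean()

score = subspace_alignment(torch.randn(100, 64), torch.randn(100, 64))
```

Under this reading, a score near 1 means the two tasks share dominant feature directions (the regime where positive transfer is plausible), while a score near 0 flags the kind of misalignment the abstract associates with negative transfer.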
