Paper Title

Meta-learning Transferable Representations with a Single Target Domain

Paper Authors

Hong Liu, Jeff Z. HaoChen, Colin Wei, Tengyu Ma

Paper Abstract

Recent works found that fine-tuning and joint training---two popular approaches for transfer learning---do not always improve accuracy on downstream tasks. First, we aim to understand more about when and why fine-tuning and joint training can be suboptimal or even harmful for transfer learning. We design semi-synthetic datasets where the source task can be solved by either source-specific features or transferable features. We observe that (1) pre-training may not have incentive to learn transferable features and (2) joint training may simultaneously learn source-specific features and overfit to the target. Second, to improve over fine-tuning and joint training, we propose Meta Representation Learning (MeRLin) to learn transferable features. MeRLin meta-learns representations by ensuring that a head fit on top of the representations with target training data also performs well on target validation data. We also prove that MeRLin recovers the target ground-truth model with a quadratic neural net parameterization and a source distribution that contains both transferable and source-specific features. On the same distribution, pre-training and joint training provably fail to learn transferable features. MeRLin empirically outperforms previous state-of-the-art transfer learning algorithms on various real-world vision and NLP transfer learning benchmarks.
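
The core of MeRLin is a bilevel objective: fit a head on top of the representation using target training data, then update the representation so that this fitted head also performs well on target validation data. Below is a minimal sketch of that meta-objective, assuming PyTorch, a regression target task, and a closed-form ridge-regression head (a simplification chosen here so the inner fit stays differentiable; the paper's actual head-fitting procedure and training loop may differ):

```python
# A minimal sketch of the MeRLin-style meta-objective, NOT the authors'
# released code. Assumptions (mine): regression target task, a linear
# head fit in closed form by ridge regression so the inner fit stays
# differentiable end-to-end.
import torch

def merlin_meta_loss(phi, x_tr, y_tr, x_val, y_val, reg=1e-3):
    """Fit a head on target-train features; score it on target-val.

    phi is the shared representation being meta-learned. Because the
    ridge solution below is differentiable, backprop on the returned
    loss updates phi itself, not just the head.
    """
    f_tr, f_val = phi(x_tr), phi(x_val)          # extract features
    d = f_tr.shape[1]
    # Closed-form ridge head: w = (F^T F + reg*I)^{-1} F^T y
    gram = f_tr.T @ f_tr + reg * torch.eye(d)
    w = torch.linalg.solve(gram, f_tr.T @ y_tr)
    # Outer loss: held-out target performance of the fitted head
    return ((f_val @ w - y_val) ** 2).mean()

# Toy usage (hypothetical shapes): one meta-update of phi.
phi = torch.nn.Sequential(torch.nn.Linear(10, 32), torch.nn.ReLU(),
                          torch.nn.Linear(32, 8))
opt = torch.optim.Adam(phi.parameters(), lr=1e-3)
x_tr, y_tr = torch.randn(50, 10), torch.randn(50, 1)
x_val, y_val = torch.randn(40, 10), torch.randn(40, 1)
loss = merlin_meta_loss(phi, x_tr, y_tr, x_val, y_val)
loss.backward()   # gradients flow through the head fit into phi
opt.step()
```

In the full method, this meta-loss would be combined with the source-task loss when updating the shared representation; the closed-form head above stands in for whatever inner fitting procedure the paper actually uses.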
