Paper Title
Small Towers Make Big Differences
Paper Authors
Paper Abstract
Multi-task learning aims at solving multiple machine learning tasks at the same time. A good solution to a multi-task learning problem should be generalizable in addition to being Pareto optimal. In this paper, we provide some insights on understanding the trade-off between Pareto efficiency and generalization as a result of parameterization in multi-task deep learning models. As a multi-objective optimization problem, enough parameterization is needed for handling task conflicts in a constrained solution space; however, from a multi-task generalization perspective, over-parameterization undermines the benefit of learning a shared representation which helps harder tasks or tasks with limited training examples. A delicate balance between multi-task generalization and multi-objective optimization is therefore needed for finding a better trade-off between efficiency and generalization. To this end, we propose a method of under-parameterized self-auxiliaries for multi-task models to achieve the best of both worlds. It is task-agnostic and works with other multi-task learning algorithms. Empirical results show that small towers of under-parameterized self-auxiliaries can make big differences in improving Pareto efficiency in various multi-task applications.
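To make the "small towers" idea concrete, the following is a minimal sketch, not the paper's actual implementation: a shared-bottom multi-task model where each task has a regular main tower plus a deliberately narrow (under-parameterized) self-auxiliary tower trained on the same labels. The class names, layer sizes, loss function, and the auxiliary loss weight are all illustrative assumptions.

```python
import torch
import torch.nn as nn


class MultiTaskWithSelfAuxiliaries(nn.Module):
    """Hypothetical sketch: shared bottom, per-task main towers, and
    small self-auxiliary towers that predict the same task targets."""

    def __init__(self, in_dim, shared_dim=256, main_dim=128, aux_dim=16, num_tasks=2):
        super().__init__()
        # Shared representation used by all tasks.
        self.shared = nn.Sequential(
            nn.Linear(in_dim, shared_dim), nn.ReLU(),
            nn.Linear(shared_dim, shared_dim), nn.ReLU(),
        )
        # Main towers: regular capacity, one per task.
        self.main_towers = nn.ModuleList(
            nn.Sequential(nn.Linear(shared_dim, main_dim), nn.ReLU(),
                          nn.Linear(main_dim, 1))
            for _ in range(num_tasks)
        )
        # Self-auxiliary towers: same tasks, far fewer parameters (aux_dim << main_dim).
        self.aux_towers = nn.ModuleList(
            nn.Sequential(nn.Linear(shared_dim, aux_dim), nn.ReLU(),
                          nn.Linear(aux_dim, 1))
            for _ in range(num_tasks)
        )

    def forward(self, x):
        h = self.shared(x)
        main_out = [tower(h) for tower in self.main_towers]
        aux_out = [tower(h) for tower in self.aux_towers]
        return main_out, aux_out


def multitask_loss(main_out, aux_out, targets, aux_weight=0.1):
    """Per-task main losses plus down-weighted auxiliary losses on the
    same labels; aux_weight=0.1 is an illustrative choice, not from the paper."""
    loss_fn = nn.BCEWithLogitsLoss()
    loss = torch.zeros(())
    for m, a, y in zip(main_out, aux_out, targets):
        loss = loss + loss_fn(m, y) + aux_weight * loss_fn(a, y)
    return loss
```

The auxiliary towers add few parameters and are task-agnostic, so this pattern composes with other multi-task training schemes; how the auxiliary outputs are used beyond the extra loss term (e.g., whether they are discarded at inference) is an assumption left open in this sketch.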