Paper Title


Where to Begin? On the Impact of Pre-Training and Initialization in Federated Learning

Authors

John Nguyen, Jianyu Wang, Kshitiz Malik, Maziar Sanjabi, Michael Rabbat

Abstract


An oft-cited challenge of federated learning is the presence of heterogeneity. \emph{Data heterogeneity} refers to the fact that data from different clients may follow very different distributions. \emph{System heterogeneity} refers to client devices having different system capabilities. A considerable number of federated optimization methods address this challenge. In the literature, empirical evaluations usually start federated training from random initialization. However, in many practical applications of federated learning, the server has access to proxy data for the training task that can be used to pre-train a model before starting federated training. Using four standard federated learning benchmark datasets, we empirically study the impact of starting from a pre-trained model in federated learning. Unsurprisingly, starting from a pre-trained model reduces the training time required to reach a target error rate and enables the training of more accurate models (up to 40\%) than is possible when starting from random initialization. Surprisingly, we also find that starting federated learning from a pre-trained initialization reduces the effect of both data and system heterogeneity. We recommend future work proposing and evaluating federated optimization methods to evaluate the performance when starting from random and pre-trained initializations. This study raises several questions for further work on understanding the role of heterogeneity in federated optimization. \footnote{Our code is available at: \url{https://github.com/facebookresearch/where_to_begin}}
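The core comparison in the abstract — running federated averaging from a random initialization versus from a pre-trained one — can be sketched in a few lines. The following is a minimal toy sketch, not the paper's actual experimental setup: clients fit a one-parameter model y = w·x with FedAvg, each client holding data drawn from a different input range (a simple stand-in for data heterogeneity). The data values, learning rate, step counts, and the "pre-trained" starting weight are all illustrative assumptions.

```python
# Minimal FedAvg sketch (illustrative only, not the paper's setup).
# Each client fits y = w * x by local SGD on squared error;
# the server averages the clients' weights each round.
import random

def local_sgd(w, data, lr=0.05, steps=10):
    """Run a few local SGD steps on squared error for y = w * x."""
    for _ in range(steps):
        x, y = random.choice(data)
        grad = 2.0 * (w * x - y) * x  # d/dw (w*x - y)^2
        w -= lr * grad
    return w

def fedavg(w0, clients, rounds=30):
    """Federated averaging starting from the initial weight w0."""
    w = w0
    for _ in range(rounds):
        # Every client starts the round from the current global weight;
        # the server then averages the locally updated weights.
        w = sum(local_sgd(w, c) for c in clients) / len(clients)
    return w

random.seed(0)
# Heterogeneous clients: same underlying rule y = 3x, but each client
# only observes inputs from its own (disjoint) range.
clients = [
    [(x, 3.0 * x) for x in (0.2, 0.3, 0.4)],
    [(x, 3.0 * x) for x in (1.0, 1.1, 1.2)],
    [(x, 3.0 * x) for x in (1.8, 1.9, 2.0)],
]

w_random = fedavg(w0=random.gauss(0.0, 1.0), clients=clients)  # random init
w_pretrained = fedavg(w0=2.9, clients=clients)  # "pre-trained" init near the optimum
print(w_random, w_pretrained)
```

In this toy setting both runs converge to w = 3, but the run started near the optimum begins far closer to the target from round one, loosely mirroring the abstract's observation that pre-trained initializations reduce the training time needed to reach a target error.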
