Paper Title

On the Importance and Applicability of Pre-Training for Federated Learning

Paper Authors

Hong-You Chen, Cheng-Hao Tu, Ziwei Li, Han-Wei Shen, Wei-Lun Chao

Paper Abstract

Pre-training is prevalent in today's deep learning to improve the performance of learned models. However, in the federated learning (FL) literature, neural networks are mostly initialized with random weights. This motivated us to conduct a systematic study of pre-training for FL. Across multiple visual recognition benchmarks, we found that pre-training can not only improve FL, but also close its accuracy gap to its centralized learning counterpart, especially in the challenging cases of non-IID clients' data. To make our findings applicable to situations where pre-trained models are not directly available, we explore pre-training with synthetic data, or even with clients' data in a decentralized manner, and find that these approaches can already improve FL notably. Interestingly, many of the techniques we explore are complementary to one another and further boost performance; we view this as a critical result toward scaling up deep FL for real-world applications. We conclude the paper with an attempt to understand the effect of pre-training on FL. We found that pre-training enables the global models learned under different clients' data conditions to converge to the same loss basin, and makes global aggregation in FL more stable. Nevertheless, pre-training does not seem to alleviate local model drift, a fundamental problem of FL under non-IID data.
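To make the setting concrete, below is a minimal sketch of federated averaging started from a pre-trained initialization rather than random weights, which is the basic idea the abstract studies. The specific choices here are illustrative assumptions, not the paper's configuration: the backbone (torchvision ResNet-18 with ImageNet weights), the 10-class head, plain SGD for local updates, and uniform (unweighted) client averaging.

```python
import copy
import torch
import torch.nn as nn
from torchvision import models

# Global model initialized from pre-trained weights instead of random ones.
# ResNet-18 / ImageNet and the 10-class head are illustrative choices only.
global_model = models.resnet18(weights=models.ResNet18_Weights.IMAGENET1K_V1)
global_model.fc = nn.Linear(global_model.fc.in_features, 10)


def local_update(model, loader, epochs=1, lr=0.01):
    """One client's local training pass (plain SGD); returns its weights."""
    model = copy.deepcopy(model)
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    model.train()
    for _ in range(epochs):
        for x, y in loader:
            opt.zero_grad()
            loss_fn(model(x), y).backward()
            opt.step()
    return model.state_dict()


def fedavg_round(global_model, client_loaders):
    """One FedAvg round: broadcast, local training, uniform weight averaging."""
    client_states = [local_update(global_model, dl) for dl in client_loaders]
    avg_state = copy.deepcopy(client_states[0])
    for key in avg_state:
        stacked = torch.stack([s[key].float() for s in client_states])
        avg_state[key] = stacked.mean(dim=0).to(avg_state[key].dtype)
    global_model.load_state_dict(avg_state)
    return global_model
```

With random initialization the same loop is standard FedAvg; the study's point is that swapping in pre-trained (or synthetically/decentrally pre-trained) weights for `global_model` stabilizes the aggregation step, even though local models still drift under non-IID data.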
