Title
Evaluation Framework For Large-scale Federated Learning
Authors
Abstract
Federated learning is proposed as a machine learning setting that enables distributed edge devices, such as mobile phones, to collaboratively learn a shared prediction model while keeping all training data on device. This not only takes full advantage of data distributed across millions of nodes to train a good model but also protects data privacy. However, learning in this scenario poses new challenges. In practice, data across a massive number of unreliable devices is likely to be non-IID (not independent and identically distributed), which may make the performance of models trained by federated learning unstable. In this paper, we introduce a framework designed for large-scale federated learning, consisting of approaches to generating datasets and a modular evaluation framework. First, we construct a suite of open-source non-IID datasets covering three kinds of distribution shift grounded in real-world assumptions: covariate shift, prior probability shift, and concept shift. In addition, we design several rigorous evaluation metrics, including the number of network nodes, the size of the datasets, the number of communication rounds, and the communication resources consumed. Finally, we present an open-source benchmark for large-scale federated learning research.
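To make one of the three shift types concrete: prior probability shift means clients differ in their label distributions P(y). A common way to simulate this is label skew, where each client's data is drawn from only a few classes. The sketch below illustrates this idea; the function name, parameters, and partitioning scheme are hypothetical illustrations, not the paper's actual dataset generator.

```python
import random
from collections import defaultdict

def prior_probability_shift_split(labels, num_clients, classes_per_client, seed=0):
    """Partition sample indices so each client sees only a few classes
    (label skew), a simple proxy for prior probability shift.

    Hypothetical helper for illustration; not the framework's real API.
    """
    rng = random.Random(seed)
    # Group sample indices by their class label.
    by_class = defaultdict(list)
    for idx, y in enumerate(labels):
        by_class[y].append(idx)
    class_list = sorted(by_class)
    partitions = []
    for _ in range(num_clients):
        # Each client draws samples from a small random subset of classes,
        # so its local label distribution P(y) differs from the global one.
        chosen = rng.sample(class_list, classes_per_client)
        client_idx = []
        for cls in chosen:
            pool = by_class[cls]
            take = rng.sample(pool, max(1, len(pool) // num_clients))
            client_idx.extend(take)
        partitions.append(client_idx)
    return partitions
```

Covariate shift (differing P(x), e.g. per-client feature transformations) and concept shift (differing P(y|x), e.g. label remapping per client) could be simulated with analogous per-client transformations.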