Paper Title
FedGS: Federated Graph-based Sampling with Arbitrary Client Availability
Paper Authors
Paper Abstract
While federated learning has shown strong results in optimizing a machine learning model without direct access to the original data, its performance may be hindered by intermittent client availability, which slows down convergence and biases the final learned model. Achieving both stable and bias-free training under arbitrary client availability poses significant challenges. To address these challenges, we propose a framework named Federated Graph-based Sampling (FedGS) to stabilize the global model update and simultaneously mitigate long-term bias under arbitrary client availability. First, we model the data correlations of clients with a Data-Distribution-Dependency Graph (3DG) that helps keep the sampled clients' data apart from each other, which is theoretically shown to improve the approximation to the optimal model update. Second, under the constraint that the sampled clients' data distributions remain far apart, we further minimize the variance of the number of times each client is sampled, to mitigate long-term bias. To validate the effectiveness of FedGS, we conduct experiments on three datasets under a comprehensive set of seven client availability modes. Our experimental results confirm FedGS's advantage in both enabling a fair client-sampling scheme and improving model performance under arbitrary client availability. Our code is available at \url{https://github.com/WwZzz/FedGS}.
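The abstract's two ideas — sampling clients whose data distributions are far apart, while keeping each client's cumulative sampling count balanced — can be illustrated with a minimal greedy sketch. This is an illustrative simplification, not the authors' actual FedGS algorithm: the pairwise distance matrix `dist_matrix` (standing in for the 3DG edge weights), the `sample_counts` penalty, and the weighting factor `alpha` are all assumptions introduced here for clarity.

```python
def sample_clients(dist_matrix, sample_counts, available, k, alpha=1.0):
    """Greedy illustrative sketch (not the exact FedGS procedure).

    Picks k available clients whose data distributions are pairwise
    far apart (per dist_matrix), while penalizing clients that have
    already been sampled often (sample_counts) to reduce long-term bias.
    """
    # Start from the least-sampled available client for fairness.
    candidates = sorted(available, key=lambda c: sample_counts[c])
    chosen = [candidates.pop(0)]
    while len(chosen) < k and candidates:
        # Score rewards distance to the already-chosen set and
        # penalizes clients with a high cumulative sampling count.
        def score(c):
            return min(dist_matrix[c][s] for s in chosen) - alpha * sample_counts[c]
        best = max(candidates, key=score)
        candidates.remove(best)
        chosen.append(best)
    return chosen


# Hypothetical example: 4 clients, symmetric distribution distances.
D = [
    [0, 1, 5, 2],
    [1, 0, 3, 4],
    [5, 3, 0, 1],
    [2, 4, 1, 0],
]
# With equal sampling counts, the picker favors the most distant pair.
print(sample_clients(D, [0, 0, 0, 0], [0, 1, 2, 3], k=2))  # → [0, 2]
```

The `min`-over-chosen scoring is a standard max-min diversity heuristic; the real FedGS formulation instead derives the selection from the 3DG and an explicit variance-minimization objective over sampling counts.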