Paper Title

TiFL: A Tier-based Federated Learning System

Paper Authors

Zheng Chai, Ahsan Ali, Syed Zawad, Stacey Truex, Ali Anwar, Nathalie Baracaldo, Yi Zhou, Heiko Ludwig, Feng Yan, Yue Cheng

Paper Abstract

Federated Learning (FL) enables learning a shared model across many clients without violating privacy requirements. One of the key attributes in FL is the heterogeneity that exists in both resource and data due to the differences in computation and communication capacity, as well as the quantity and content of data among different clients. We conduct a case study to show that heterogeneity in resource and data has a significant impact on training time and model accuracy in conventional FL systems. To this end, we propose TiFL, a Tier-based Federated Learning System, which divides clients into tiers based on their training performance and selects clients from the same tier in each training round to mitigate the straggler problem caused by heterogeneity in resource and data quantity. To further tame the heterogeneity caused by non-IID (Independent and Identically Distributed) data and resources, TiFL employs an adaptive tier selection approach to update the tiering on the fly based on the observed training performance and accuracy over time. We prototype TiFL in an FL testbed following Google's FL architecture and evaluate it using popular benchmarks and the state-of-the-art FL benchmark LEAF. Experimental evaluation shows that TiFL outperforms conventional FL in various heterogeneous conditions. With the proposed adaptive tier selection policy, we demonstrate that TiFL achieves much faster training performance while keeping the same (and in some cases better) test accuracy across the board.
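The abstract describes two mechanisms: static tiering of clients by profiled training latency with per-round selection from a single tier, and an adaptive policy that re-weights tier selection based on observed accuracy. The Python sketch below illustrates that idea under stated assumptions; the function names, the tier count, the equal-size split, and the inverse-accuracy weighting are illustrative choices, not the authors' implementation.

```python
import random

def assign_tiers(latencies, num_tiers=3):
    """Group clients into tiers by profiled per-round training latency.
    `latencies` maps client_id -> seconds; faster clients land in earlier
    tiers. The tier count and equal-size split are assumptions."""
    ranked = sorted(latencies, key=latencies.get)
    size = -(-len(ranked) // num_tiers)  # ceiling division
    return [ranked[i:i + size] for i in range(0, len(ranked), size)]

def select_clients(tiers, tier_probs, clients_per_round=5):
    """Pick one tier (weighted by tier_probs), then sample participants
    from that tier only, so a round never mixes fast and slow clients
    and the slowest participant no longer dictates round time."""
    tier = random.choices(range(len(tiers)), weights=tier_probs)[0]
    pool = tiers[tier]
    return tier, random.sample(pool, min(clients_per_round, len(pool)))

def update_tier_probs(tier_accuracies):
    """Adaptive selection sketch: bias future rounds toward tiers whose
    observed accuracy lags, so data held by slower tiers still shapes
    the model. Inverse-accuracy weighting is a stand-in for the paper's
    actual adaptive policy (an assumption)."""
    weights = [max(1.0 - acc, 1e-6) for acc in tier_accuracies]
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical usage with 10 clients of heterogeneous speed.
latencies = {f"client{i}": random.uniform(1.0, 20.0) for i in range(10)}
tiers = assign_tiers(latencies)
probs = [1.0 / len(tiers)] * len(tiers)
tier, participants = select_clients(tiers, probs)
probs = update_tier_probs([0.80, 0.72, 0.65])  # e.g. per-tier test accuracy
```

One design point the sketch preserves: because all participants in a round come from the same tier, the round's wall-clock time is bounded by that tier's latency rather than by the slowest client overall, while the adaptive re-weighting keeps slower tiers' data from being starved out.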
