Paper Title
Supernet Training for Federated Image Classification under System Heterogeneity
Paper Authors
Paper Abstract
Efficient deployment of deep neural networks across many devices under resource constraints, particularly on edge devices, is one of the most challenging problems when data privacy must also be preserved. Conventional approaches have evolved either to improve a single global model while keeping each client's heterogeneous training data decentralized (i.e., data heterogeneity; Federated Learning (FL)) or to train an overarching network that supports diverse architectural settings for heterogeneous systems with different computational capabilities (i.e., system heterogeneity; Neural Architecture Search). However, few studies have considered both directions simultaneously. This paper proposes the federation of supernet training (FedSup) framework to address both scenarios at once: clients send and receive a supernet that contains all possible architectures sampled from itself. The approach is inspired by the observation that averaging parameters during model aggregation in FL is similar to weight sharing in supernet training. The proposed FedSup framework therefore combines the weight-sharing approach widely used for training single-shot models with FL averaging (FedAvg). Furthermore, we develop an efficient algorithm (E-FedSup) that sends only a sub-model to each client in the broadcast stage, reducing communication costs and training overhead, together with several strategies to enhance supernet training in the FL environment. We verify the proposed approach through extensive empirical evaluations. The resulting framework also ensures robustness to data and model heterogeneity on several standard benchmarks.
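The core analogy in the abstract — FedAvg parameter averaging resembling weight sharing in supernet training — can be illustrated with a minimal sketch. The code below is a hypothetical toy illustration, not the paper's implementation: `sample_subnet`, `fedavg`, and the channel-slicing scheme (borrowed from slimmable-network-style weight sharing) are assumptions for the sake of the example.

```python
import numpy as np

def sample_subnet(supernet_weight: np.ndarray, width_mult: float) -> np.ndarray:
    """Slice a prefix of the output channels of a supernet layer
    (slimmable-style weight sharing): every sub-model reuses a
    prefix of the same shared weight tensor."""
    out_ch = max(1, int(supernet_weight.shape[0] * width_mult))
    return supernet_weight[:out_ch]

def fedavg(client_weights: list, num_samples: list) -> np.ndarray:
    """Sample-size-weighted average of client parameters (FedAvg)."""
    total = sum(num_samples)
    return sum(w * (n / total) for w, n in zip(client_weights, num_samples))

# Toy setup: one fully connected layer of a supernet, three clients.
rng = np.random.default_rng(0)
supernet = rng.standard_normal((8, 4))  # 8 output channels, 4 inputs

# FedSup-style round: each client trains the full supernet locally
# (mocked here as a small perturbation) and the server averages.
client_updates = [supernet + 0.01 * rng.standard_normal(supernet.shape)
                  for _ in range(3)]
new_supernet = fedavg(client_updates, num_samples=[100, 50, 150])
assert new_supernet.shape == supernet.shape

# E-FedSup-style broadcast: a resource-limited client receives only a
# sub-model, halving the communicated parameters in this toy layer.
sub = sample_subnet(supernet, width_mult=0.5)
assert sub.shape == (4, 4)
```

The sketch shows why the two mechanisms compose naturally: averaging shared prefixes of weight tensors across clients is the same operation whether the clients trained the full supernet (FedSup) or only a sampled sub-network (E-FedSup).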