Paper Title
Fine-tuning Global Model via Data-Free Knowledge Distillation for Non-IID Federated Learning
Paper Authors
Paper Abstract
Federated Learning (FL) is an emerging distributed learning paradigm under privacy constraints. Data heterogeneity is one of the main challenges in FL, as it results in slow convergence and degraded performance. Most existing approaches tackle the heterogeneity challenge only by restricting local model updates on the clients, ignoring the performance drop caused by direct global model aggregation. Instead, we propose a data-free knowledge distillation method that fine-tunes the global model on the server (FedFTG), which relieves the issue of direct model aggregation. Concretely, FedFTG explores the input space of the local models through a generator and uses it to transfer knowledge from the local models to the global model. Besides, we propose a hard sample mining scheme to achieve effective knowledge distillation throughout training. In addition, we develop customized label sampling and a class-level ensemble to derive maximum utilization of knowledge, which implicitly mitigates the distribution discrepancy across clients. Extensive experiments show that FedFTG significantly outperforms state-of-the-art (SOTA) FL algorithms and can serve as a strong plugin for enhancing FedAvg, FedProx, FedDyn, and SCAFFOLD.
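To make the server-side procedure described in the abstract more concrete, below is a minimal PyTorch sketch of a data-free fine-tuning loop of this kind. It is not the authors' implementation: the conditional generator `generator(z, y)` with a `noise_dim` attribute, the `local_models` list, the `label_probs` class-distribution vector, the losses, and all hyperparameters are illustrative assumptions, and the class-level ensemble is simplified to a uniform average.

```python
# Hypothetical sketch of data-free server-side fine-tuning in the style of FedFTG.
# Assumptions (not from the paper's code): `global_model` is the aggregated student,
# `local_models` are the sampled clients' models (teachers), `generator` is a
# conditional generator G(z, y), and `label_probs` approximates the clients'
# aggregate class distribution (used for customized label sampling).
import torch
import torch.nn.functional as F

def server_finetune(global_model, local_models, generator, label_probs,
                    steps=10, batch_size=64, device="cpu"):
    opt_g = torch.optim.Adam(generator.parameters(), lr=1e-3)
    opt_s = torch.optim.SGD(global_model.parameters(), lr=0.01)
    for _ in range(steps):
        # Customized label sampling: draw labels from the aggregate class distribution.
        y = torch.multinomial(label_probs, batch_size, replacement=True).to(device)
        z = torch.randn(batch_size, generator.noise_dim, device=device)  # assumed attribute

        # Generator step (hard sample mining): synthesize inputs on which the global
        # model disagrees with the teacher ensemble, while staying classifiable as y.
        x = generator(z, y)
        t_logits = ensemble_logits(local_models, x)
        s_logits = global_model(x)
        loss_g = (-F.kl_div(F.log_softmax(s_logits, dim=1),
                            F.softmax(t_logits, dim=1), reduction="batchmean")
                  + F.cross_entropy(t_logits, y))
        opt_g.zero_grad(); loss_g.backward(); opt_g.step()

        # Distillation step: fine-tune the global model toward the teacher ensemble
        # on the freshly generated pseudo-data.
        x = generator(z, y).detach()
        t_logits = ensemble_logits(local_models, x).detach()
        s_logits = global_model(x)
        loss_s = F.kl_div(F.log_softmax(s_logits, dim=1),
                          F.softmax(t_logits, dim=1), reduction="batchmean")
        opt_s.zero_grad(); loss_s.backward(); opt_s.step()
    return global_model

def ensemble_logits(local_models, x):
    # Simplified teacher ensemble: uniform average of local predictions.
    # The paper's class-level ensemble weights each teacher per class instead.
    return torch.stack([m(x) for m in local_models]).mean(dim=0)
```

In this sketch the generator and the global model are updated alternately each round: the generator seeks hard pseudo-samples where student and teachers disagree, and the global model is then distilled on them, which matches the high-level description above but omits the paper's per-class weighting and regularization details.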