Paper Title
Ensemble Distillation for Robust Model Fusion in Federated Learning
Paper Authors
Paper Abstract
Federated Learning (FL) is a machine learning setting where many devices collaboratively train a machine learning model while keeping the training data decentralized. In most of the current training schemes, the central model is refined by averaging the parameters of the server model and the updated parameters from the client side. However, directly averaging model parameters is only possible if all models have the same structure and size, which can be a restrictive constraint in many scenarios. In this work we investigate more powerful and more flexible aggregation schemes for FL. Specifically, we propose ensemble distillation for model fusion, i.e., training the central classifier through unlabeled data on the outputs of the models from the clients. This knowledge distillation technique mitigates privacy risk and cost to the same extent as the baseline FL algorithms, but allows flexible aggregation over heterogeneous client models that can differ, e.g., in size, numerical precision or structure. We show in extensive empirical experiments on various CV/NLP datasets (CIFAR-10/100, ImageNet, AG News, SST2) and settings (heterogeneous models/data) that the server model can be trained much faster, requiring fewer communication rounds than any existing FL technique so far.
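The fusion step described in the abstract replaces parameter averaging with distillation on unlabeled data: the clients' output logits form an ensemble teacher, and the server model is trained to match them. Below is a minimal PyTorch-style sketch of this idea under stated assumptions; the function name, hyperparameters (temperature, learning rate, number of steps), and data-loading details are illustrative and not taken from the paper's reference implementation. Because only the clients' outputs are consumed, the client models may differ in size or architecture.

```python
# Minimal sketch of ensemble distillation for model fusion (assumed interface,
# not the authors' code): the server model is distilled on unlabeled data
# toward the averaged predictions of the client models.
import torch
import torch.nn.functional as F

def fuse_by_ensemble_distillation(server_model, client_models, unlabeled_loader,
                                  temperature=1.0, lr=1e-3, steps=100):
    """Refine the server model by matching the averaged client logits
    computed on unlabeled data, instead of averaging model parameters."""
    optimizer = torch.optim.Adam(server_model.parameters(), lr=lr)
    for m in client_models:
        m.eval()
    server_model.train()

    step = 0
    for x in unlabeled_loader:  # unlabeled inputs only, no targets needed
        with torch.no_grad():
            # Ensemble teacher: average the clients' logits for this batch.
            teacher_logits = torch.stack([m(x) for m in client_models]).mean(dim=0)
            teacher_probs = F.softmax(teacher_logits / temperature, dim=-1)

        # Student (server) prediction and KL distillation loss.
        student_log_probs = F.log_softmax(server_model(x) / temperature, dim=-1)
        loss = F.kl_div(student_log_probs, teacher_probs, reduction="batchmean")

        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

        step += 1
        if step >= steps:
            break
    return server_model
```

In an FL round, the server would first collect the updated client models (or their outputs), run a fusion step like the sketch above on a public or generated unlabeled dataset, and then broadcast the refined server model for the next round.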