Paper Title

Improving Semi-supervised Federated Learning by Reducing the Gradient Diversity of Models

Paper Authors

Zhengming Zhang, Yaoqing Yang, Zhewei Yao, Yujun Yan, Joseph E. Gonzalez, Michael W. Mahoney

Abstract

Federated learning (FL) is a promising way to use the computing power of mobile devices while maintaining the privacy of users. Current work in FL, however, makes the unrealistic assumption that the users have ground-truth labels on their devices, while also assuming that the server has neither data nor labels. In this work, we consider the more realistic scenario where the users have only unlabeled data, while the server has some labeled data, and where the amount of labeled data is smaller than the amount of unlabeled data. We call this learning problem semi-supervised federated learning (SSFL). For SSFL, we demonstrate that a critical issue that affects the test accuracy is the large gradient diversity of the models from different users. Based on this, we investigate several design choices. First, we find that the so-called consistency regularization loss (CRL), which is widely used in semi-supervised learning, performs reasonably well but has large gradient diversity. Second, we find that Batch Normalization (BN) increases gradient diversity. Replacing BN with the recently-proposed Group Normalization (GN) can reduce gradient diversity and improve test accuracy. Third, we show that CRL combined with GN still has a large gradient diversity when the number of users is large. Based on these results, we propose a novel grouping-based model averaging method to replace the FedAvg averaging method. Overall, our grouping-based averaging, combined with GN and CRL, achieves better test accuracy than not just a contemporary paper on SSFL in the same settings (>10\%), but also four supervised FL algorithms.
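The abstract's central quantity, gradient diversity, can be illustrated concretely. A minimal sketch (not the authors' code) of the gradient diversity measure commonly used in the literature: diversity(g_1, …, g_n) = Σ_i ||g_i||² / ||Σ_i g_i||². Higher values mean the per-user gradients point in more conflicting directions, which is the failure mode the paper attributes to CRL and BN in the SSFL setting.

```python
def gradient_diversity(gradients):
    """Compute the gradient diversity of a list of per-user gradient
    vectors (each a list of floats of equal length).

    diversity = (sum of squared norms) / (squared norm of the sum).
    Identical gradients give the minimum value 1/n; orthogonal or
    opposing gradients give larger values.
    """
    dim = len(gradients[0])
    # Numerator: sum over users of ||g_i||^2.
    sum_sq_norms = sum(sum(x * x for x in g) for g in gradients)
    # Denominator: ||sum over users of g_i||^2.
    summed = [sum(g[j] for g in gradients) for j in range(dim)]
    norm_sq_of_sum = sum(x * x for x in summed)
    return sum_sq_norms / norm_sq_of_sum

# Two users with identical gradients: minimum diversity 1/2.
print(gradient_diversity([[1.0, 0.0], [1.0, 0.0]]))  # -> 0.5
# Two users with orthogonal gradients: diversity 1.0.
print(gradient_diversity([[1.0, 0.0], [0.0, 1.0]]))  # -> 1.0
```

In this measure, reducing diversity (e.g., by replacing BN with GN, or by grouping-based averaging as the paper proposes) corresponds to making the per-user gradients more aligned, so the averaged update better represents each user's direction.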
