Over-The-Air Federated Learning under Byzantine Attacks

Paper Title

Over-The-Air Federated Learning under Byzantine Attacks

Authors

Houssem Sifaou, Geoffrey Ye Li

Abstract

Federated learning (FL) is a promising solution to enable many AI applications, where sensitive datasets from distributed clients are needed for collaboratively training a global model. FL allows the clients to participate in the training phase, governed by a central server, without sharing their local data. One of the main challenges of FL is the communication overhead, where the model updates of the participating clients are sent to the central server at each global training round. Over-the-air computation (AirComp) has been recently proposed to alleviate the communication bottleneck where the model updates are sent simultaneously over the multiple-access channel. However, simple averaging of the model updates via AirComp makes the learning process vulnerable to random or intended modifications of the local model updates of some Byzantine clients. In this paper, we propose a transmission and aggregation framework to reduce the effect of such attacks while preserving the benefits of AirComp for FL. For the proposed robust approach, the central server divides the participating clients randomly into groups and allocates a transmission time slot for each group. The updates of the different groups are then aggregated using a robust aggregation technique. We extend our approach to handle the case of non-i.i.d. local data, where a resampling step is added before robust aggregation. We analyze the convergence of the proposed approach for both cases of i.i.d. and non-i.i.d. data and demonstrate that the proposed algorithm converges at a linear rate to a neighborhood of the optimal solution. Experiments on real datasets are provided to confirm the robustness of the proposed approach.
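The grouping and robust-aggregation idea described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not name the robust aggregator, so a coordinate-wise median is assumed here, and the over-the-air channel is idealized as a noise-free average of each group's updates. The function name `grouped_aircomp_round` and its parameters are hypothetical, and the resampling step for non-i.i.d. data is omitted.

```python
import numpy as np


def grouped_aircomp_round(updates, num_groups, rng):
    """One aggregation round (illustrative sketch only).

    updates:    array of shape (num_clients, dim), one model update per client.
    num_groups: number of groups, i.e. transmission time slots.
    rng:        a numpy.random.Generator used for the random grouping.
    """
    num_clients = updates.shape[0]

    # The server splits the clients randomly into groups.
    perm = rng.permutation(num_clients)
    groups = np.array_split(perm, num_groups)

    # Each group transmits in its own time slot. AirComp is idealized here
    # as a noise-free average of the updates in the group (assumption).
    group_means = np.stack([updates[idx].mean(axis=0) for idx in groups])

    # Robust aggregation across groups. A coordinate-wise median is an
    # assumed stand-in for the aggregator used in the paper.
    return np.median(group_means, axis=0)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    honest = 1.0 + 0.01 * rng.standard_normal((18, 3))
    byzantine = np.full((2, 3), 1000.0)  # two clients send corrupted updates
    updates = np.vstack([honest, byzantine])

    print("plain average: ", updates.mean(axis=0))
    print("grouped median:", grouped_aircomp_round(updates, 5, rng))
```

With 2 corrupted clients and 5 groups, at most 2 group averages are contaminated, so the median over groups stays close to the honest updates, whereas the plain average over all clients is pulled far away.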
