Paper Title
FOCUS: Dealing with Label Quality Disparity in Federated Learning
Paper Authors
Abstract
Ubiquitous systems with an End-Edge-Cloud architecture are increasingly used in healthcare applications. Federated Learning (FL) is well suited to such applications because data are held in silos and privacy must be preserved. However, existing FL approaches generally do not account for disparities in the quality of local data labels, even though clients in ubiquitous systems tend to suffer from label noise caused by annotators' varying skill levels, biases, or malicious tampering. In this paper, we propose Federated Opportunistic Computing for Ubiquitous Systems (FOCUS) to address this challenge. FOCUS maintains a small set of benchmark samples on the FL server and quantifies the credibility of each client's local data without directly observing them, by computing the mutual cross-entropy between the performance of the FL model on the local dataset and that of the client's local FL model on the benchmark dataset. A credit-weighted orchestration is then performed to adjust the weight assigned to each client in the FL model according to its credibility value. FOCUS has been experimentally evaluated on both synthetic and real-world data. The results show that it effectively identifies clients with noisy labels and reduces their impact on model performance, thereby significantly outperforming existing FL approaches.
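The two steps described in the abstract, credibility scoring via mutual cross-entropy followed by credit-weighted aggregation, can be illustrated with a minimal Python sketch. The abstract does not give the exact formulas, so several choices below are assumptions made purely for illustration: the mutual cross-entropy is taken as the sum of the two cross-entropy terms, the credibility is taken as one minus a softmax over those values (with a hypothetical sharpness parameter `alpha`), and the aggregation weight is taken as credibility times local dataset size, normalized. All function names are hypothetical.

```python
import math


def cross_entropy(probs, labels):
    """Mean negative log-likelihood of the given labels.

    probs: list of per-sample class-probability lists.
    labels: list of integer class indices.
    """
    eps = 1e-12  # guard against log(0)
    total = sum(math.log(max(p[y], eps)) for p, y in zip(probs, labels))
    return -total / len(labels)


def mutual_cross_entropy(global_on_local, local_labels,
                         local_on_benchmark, benchmark_labels):
    """Assumed form: loss of the global FL model on the client's local data
    plus loss of the client's local model on the server's benchmark data."""
    return (cross_entropy(global_on_local, local_labels)
            + cross_entropy(local_on_benchmark, benchmark_labels))


def credibility(mutual_ces, alpha=1.0):
    """Assumed mapping: 1 - softmax(alpha * E), so a larger mutual
    cross-entropy yields a lower credibility."""
    m = max(mutual_ces)  # subtract the max for numerical stability
    exps = [math.exp(alpha * (e - m)) for e in mutual_ces]
    total = sum(exps)
    return [1.0 - x / total for x in exps]


def aggregation_weights(creds, sizes):
    """Assumed weighting: credibility times local dataset size, normalized."""
    raw = [c * n for c, n in zip(creds, sizes)]
    total = sum(raw)
    if total == 0:  # e.g. a single client; fall back to size-based weights
        return [n / sum(sizes) for n in sizes]
    return [r / total for r in raw]


def aggregate(client_params, weights):
    """Weighted average of flat parameter vectors (one list per client)."""
    dim = len(client_params[0])
    return [sum(w * p[i] for w, p in zip(weights, client_params))
            for i in range(dim)]


# Toy usage: client 0 has clean labels, client 1 has noisy labels.
clean = mutual_cross_entropy([[0.9, 0.1], [0.2, 0.8]], [0, 1],
                             [[0.9, 0.1], [0.2, 0.8]], [0, 1])
noisy = mutual_cross_entropy([[0.3, 0.7], [0.6, 0.4]], [0, 1],
                             [[0.3, 0.7], [0.6, 0.4]], [0, 1])
weights = aggregation_weights(credibility([clean, noisy]), sizes=[100, 100])
global_params = aggregate([[1.0, 2.0], [5.0, 6.0]], weights)
print(weights, global_params)
```

In this toy run the noisy client has the larger mutual cross-entropy, so it receives the smaller aggregation weight and the aggregated parameters stay closer to those of the clean client. Only the client's losses, never its raw data, are needed on the server side, which matches the abstract's claim that credibility is quantified without directly observing local data.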