FedER: Federated Learning through Experience Replay and Privacy-Preserving Data Synthesis

Paper Title

FedER: Federated Learning through Experience Replay and Privacy-Preserving Data Synthesis

Authors

Pennisi, Matteo; Salanitri, Federica Proietto; Bellitto, Giovanni; Casella, Bruno; Aldinucci, Marco; Palazzo, Simone; Spampinato, Concetto

Abstract

In the medical field, multi-center collaborations are often sought to yield more generalizable findings by leveraging the heterogeneity of patient and clinical data. However, recent privacy regulations hinder data sharing and, consequently, the development of machine learning-based solutions that support diagnosis and prognosis. Federated learning (FL) aims to sidestep this limitation by bringing AI-based solutions to data owners and sharing only local AI models, or parts thereof, which then need to be aggregated. However, most existing federated learning solutions are still in their infancy and show several shortcomings, from the lack of a reliable and effective aggregation scheme able to retain the knowledge learned locally, to weak privacy preservation, since real data may be reconstructed from model updates. Furthermore, the majority of these approaches, especially those dealing with medical data, rely on a centralized distributed learning strategy that poses robustness, scalability and trust issues. In this paper we present a federated and decentralized learning strategy, FedER, which exploits experience replay and generative adversarial concepts to effectively integrate features from local nodes, providing models able to generalize across multiple datasets while maintaining privacy. FedER is tested on two tasks (tuberculosis and melanoma classification) using multiple datasets in order to simulate realistic non-i.i.d. medical data scenarios. Results show that our approach achieves performance comparable to standard (non-federated) learning and significantly outperforms state-of-the-art federated methods in their centralized (thus, more favourable) formulation. Code is available at https://github.com/perceivelab/FedER
