Paper Title


Federated Self-Supervised Contrastive Learning and Masked Autoencoder for Dermatological Disease Diagnosis

Authors

Yawen Wu, Dewen Zeng, Zhepeng Wang, Yi Sheng, Lei Yang, Alaina J. James, Yiyu Shi, Jingtong Hu

Abstract


In dermatological disease diagnosis, the private data collected by mobile dermatology assistants exist on distributed mobile devices of patients. Federated learning (FL) can use decentralized data to train models while keeping data local. Existing FL methods assume all the data have labels. However, medical data often come without full labels due to high labeling costs. Self-supervised learning (SSL) methods, such as contrastive learning (CL) and masked autoencoders (MAE), can leverage the unlabeled data to pre-train models, followed by fine-tuning with limited labels. However, combining SSL and FL poses unique challenges. For example, CL requires diverse data, but each device has only limited data. For MAE, while Vision Transformer (ViT) based MAE achieves higher accuracy than CNNs in centralized learning, MAE's performance in FL with unlabeled data has not been investigated. Besides, the ViT synchronization between the server and clients differs from that of traditional CNNs, so special synchronization methods need to be designed. In this work, we propose two federated self-supervised learning frameworks for dermatological disease diagnosis with limited labels. The first features lower computation costs, suitable for mobile devices. The second features high accuracy and fits high-performance servers. Based on CL, we propose federated contrastive learning with feature sharing (FedCLF). Features are shared for diverse contrastive information without sharing raw data for privacy. Based on MAE, we propose FedMAE. Knowledge split separates the global and local knowledge learned from each client, and only the global knowledge is aggregated for higher generalization performance. Experiments on dermatological disease datasets show superior accuracy of the proposed frameworks over state-of-the-art methods.
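The FedCLF idea described above, sharing features rather than raw images so each client sees more diverse negatives, can be sketched with a standard InfoNCE loss. This is a minimal illustration, not the paper's implementation: the `info_nce` function, the embedding dimension, and the random "features" are all assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

def normalize(x):
    # L2-normalize embeddings along the last axis.
    return x / np.linalg.norm(x, axis=-1, keepdims=True)

def info_nce(anchor, positive, negatives, temperature=0.1):
    # InfoNCE: pull the positive pair together, push negatives away.
    logits = np.concatenate(([anchor @ positive], negatives @ anchor)) / temperature
    logits -= logits.max()  # numerical stability
    return -np.log(np.exp(logits[0]) / np.exp(logits).sum())

# Two augmented views of one local image, already encoded (hypothetical).
view_a, view_b = normalize(rng.normal(size=(2, 16)))

# Negatives from the client's own small batch only...
local_negs = normalize(rng.normal(size=(3, 16)))
# ...versus enriched with feature vectors shared by other clients via the
# server (raw images never leave their devices).
remote_negs = normalize(rng.normal(size=(32, 16)))

loss_local = info_nce(view_a, view_b, local_negs)
loss_shared = info_nce(view_a, view_b, np.vstack([local_negs, remote_negs]))
```

With the shared features, the contrastive loss is computed against a much larger negative set, which is the source of the "diverse contrastive information" the abstract refers to.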
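The knowledge-split aggregation in FedMAE can likewise be sketched: only a designated "global" subset of parameters is averaged across clients, while the rest stays client-specific. The key names (`encoder.attn`, `decoder.head`) and the choice of which parameters count as global knowledge are hypothetical, chosen only to illustrate the mechanism.

```python
import numpy as np

def aggregate_with_split(client_states, global_keys):
    # FedAvg-style mean, but only over the keys declared "global";
    # everything else is treated as local knowledge and left untouched.
    return {k: np.mean([s[k] for s in client_states], axis=0)
            for k in global_keys}

# Two clients with a toy two-parameter model state.
c1 = {"encoder.attn": np.ones((2, 2)),     "decoder.head": np.zeros(2)}
c2 = {"encoder.attn": 3 * np.ones((2, 2)), "decoder.head": np.ones(2)}

# Knowledge split: aggregate only the encoder; each client keeps its
# own decoder head as local knowledge.
shared = aggregate_with_split([c1, c2], global_keys=["encoder.attn"])
for state in (c1, c2):
    state.update(shared)
```

After the update, both clients hold the averaged encoder weights while their decoder parameters remain individual, mirroring the abstract's "only global knowledge is aggregated."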
