Paper Title
FedNorm: Modality-Based Normalization in Federated Learning for Multi-Modal Liver Segmentation
Paper Authors
Paper Abstract
Liver diseases are of great socioeconomic importance because of their high incidence and the availability of effective treatment options. Liver segmentation is one of the most common methods for analyzing CT and MRI images for diagnosis and follow-up treatment. Recent advances in deep learning have shown encouraging results for automatic liver segmentation. Their success, however, depends primarily on the availability of annotated databases, which are often inaccessible because of privacy concerns. Federated Learning has recently been proposed to alleviate these challenges by training a shared global model on distributed clients without access to their local databases. Nevertheless, Federated Learning performs poorly when trained on highly heterogeneous image data, as arises from multi-modal imaging (such as CT and MRI) and multiple scanner types. To this end, we propose FedNorm and its extension FedNorm+, two Federated Learning algorithms that use a modality-based normalization technique. Specifically, FedNorm normalizes the features at the client level, while FedNorm+ employs the modality information of single slices in the feature normalization. Our methods were validated on 428 patients from six publicly available databases and compared with state-of-the-art Federated Learning algorithms and baseline models in heterogeneous settings (multi-institutional, multi-modal data). The experimental results show that our methods deliver an overall acceptable performance, achieve Dice per patient scores of up to 0.961, consistently outperform locally trained models, and are on par with or slightly better than centralized models.
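The idea of modality-based normalization in a federated setting can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the exact formulation, so the sketch assumes that features are standardized with statistics pooled per modality (CT or MRI) and that the remaining shared weights are combined by a size-weighted average across clients. The function names `normalize_by_modality` and `average_shared_weights` are hypothetical.

```python
from statistics import fmean, pstdev

MODALITIES = ("CT", "MRI")


def normalize_by_modality(slices, eps=1e-5):
    """Standardize slice features with statistics pooled per modality.

    `slices` is a list of (modality, feature_values) pairs. Pooling per
    modality is an assumption made for illustration only.
    """
    stats = {}
    for modality in MODALITIES:
        pooled = [x for m, feats in slices if m == modality for x in feats]
        if pooled:
            stats[modality] = (fmean(pooled), pstdev(pooled))

    normalized = []
    for modality, feats in slices:
        mean, std = stats[modality]
        normalized.append((modality, [(x - mean) / (std + eps) for x in feats]))
    return normalized


def average_shared_weights(client_weights, client_sizes):
    """Size-weighted average of the shared (non-normalization) weights.

    Assumes every client reports a weight vector of the same length.
    """
    total = sum(client_sizes)
    n_weights = len(client_weights[0])
    return [
        sum(w[i] * size for w, size in zip(client_weights, client_sizes)) / total
        for i in range(n_weights)
    ]


# Toy data: CT and MRI intensities live on very different scales.
slices = [("CT", [100.0, 300.0]), ("MRI", [1.0, 3.0]), ("CT", [200.0, 200.0])]
normalized = normalize_by_modality(slices)
global_weights = average_shared_weights([[1.0, 2.0], [3.0, 4.0]], [1, 3])
```

In this toy example the CT and MRI features land on a comparable scale after normalization even though their raw intensities differ by two orders of magnitude, which is the kind of heterogeneity the abstract describes.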