Paper Title

Federated Multilingual Models for Medical Transcript Analysis

Authors

Andre Manoel, Mirian Hipolito Garcia, Tal Baumel, Shize Su, Jialei Chen, Dan Miller, Danny Karmon, Robert Sim, Dimitrios Dimitriadis

Abstract

Federated Learning (FL) is a novel machine learning approach that allows the model trainer to access more data samples by training the model across multiple decentralized data sources, even while data access constraints are in place. Models trained this way can achieve significantly higher performance than models trained on a single data source. As part of FL's promise, none of the training data is ever transmitted to any central location, ensuring that sensitive data remains local and private. These characteristics make FL well suited for large-scale applications in healthcare, where a variety of compliance constraints restrict how data may be handled, processed, and stored. Despite the apparent benefits of federated learning, the heterogeneity of the local data distributions poses significant challenges, and such challenges are even more pronounced in the case of multilingual data providers. In this paper, we present a federated learning system for training a large-scale multilingual model suitable for fine-tuning on downstream tasks such as medical entity tagging. Our work represents one of the first such production-scale systems, capable of training across multiple highly heterogeneous data providers and achieving levels of accuracy that could not otherwise be achieved by central training with public data. Finally, we show that the global model performance can be further improved by a training step performed locally.
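As a rough illustration of the aggregation step the abstract alludes to, the sketch below implements federated averaging (FedAvg), the canonical FL aggregation scheme. The paper's actual system, model, and hyperparameters are not specified here, so the function name, the toy client weights, and the dataset sizes are all illustrative assumptions, not the authors' implementation.

```python
# Hedged sketch of federated averaging (FedAvg): the server combines
# client model weights, weighted by local dataset size, without ever
# seeing the clients' raw training data. All names here are illustrative.

def federated_average(client_weights, client_sizes):
    """Aggregate per-client weight vectors into a global model.

    client_weights: list of weight vectors (one list of floats per client)
    client_sizes:   number of local training samples per client
    """
    total = sum(client_sizes)
    num_params = len(client_weights[0])
    global_weights = [0.0] * num_params
    for weights, size in zip(client_weights, client_sizes):
        # Each client's contribution is proportional to its data share,
        # which matters under the heterogeneous data distributions the
        # paper highlights.
        for i, w in enumerate(weights):
            global_weights[i] += w * (size / total)
    return global_weights

# Three hypothetical clients with heterogeneous amounts of local data:
clients = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
sizes = [10, 20, 70]
print(federated_average(clients, sizes))  # [4.2, 5.2]
```

The final local training step mentioned in the abstract would correspond to each client continuing to fine-tune the returned global weights on its own data after aggregation.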
