Paper Title

MedFuse: Multi-modal fusion with clinical time-series data and chest X-ray images

Authors

Nasir Hayat, Krzysztof J. Geras, Farah E. Shamout

Abstract

Multi-modal fusion approaches aim to integrate information from different data sources. Unlike natural datasets, such as in audio-visual applications, where samples consist of "paired" modalities, data in healthcare is often collected asynchronously. Hence, requiring the presence of all modalities for a given sample is not realistic for clinical tasks and significantly limits the size of the dataset during training. In this paper, we propose MedFuse, a conceptually simple yet promising LSTM-based fusion module that can accommodate uni-modal as well as multi-modal input. We evaluate the fusion method and introduce new benchmark results for in-hospital mortality prediction and phenotype classification, using clinical time-series data in the MIMIC-IV dataset and corresponding chest X-ray images in MIMIC-CXR. Compared to more complex multi-modal fusion strategies, MedFuse provides a performance improvement by a large margin on the fully paired test set. It also remains robust across the partially paired test set containing samples with missing chest X-ray images. We release our code for reproducibility and to enable the evaluation of competing models in the future.
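The core idea of an LSTM-based fusion module that accepts uni-modal or multi-modal input can be illustrated by treating each modality's embedding as one "timestep" of a sequence: a missing modality simply shortens the sequence. The sketch below is a minimal, hypothetical simplification in plain numpy (class and variable names are my own, not the authors' released code), showing how the same module handles a paired sample (clinical time series + chest X-ray) and a partially paired sample (time series only).

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class LSTMFusion:
    """Toy LSTM cell that fuses a variable-length list of modality
    embeddings by stepping over them as a sequence. A hypothetical
    sketch of the idea, not the paper's implementation."""

    def __init__(self, dim, hidden, seed=0):
        rng = np.random.default_rng(seed)
        # One weight matrix/bias per gate: input, forget, output, candidate.
        self.W = rng.standard_normal((4, hidden, dim + hidden)) * 0.1
        self.b = np.zeros((4, hidden))
        self.hidden = hidden

    def __call__(self, embeddings):
        h = np.zeros(self.hidden)
        c = np.zeros(self.hidden)
        for x in embeddings:  # each available modality = one "timestep"
            z = np.concatenate([x, h])
            i = sigmoid(self.W[0] @ z + self.b[0])   # input gate
            f = sigmoid(self.W[1] @ z + self.b[1])   # forget gate
            o = sigmoid(self.W[2] @ z + self.b[2])   # output gate
            g = np.tanh(self.W[3] @ z + self.b[3])   # candidate state
            c = f * c + i * g
            h = o * np.tanh(c)
        return h  # fused representation, fed to a task-specific head

fusion = LSTMFusion(dim=8, hidden=4)
ehr = np.ones(8)       # stand-in for a clinical time-series embedding
cxr = np.full(8, 0.5)  # stand-in for a chest X-ray embedding

paired = fusion([ehr, cxr])  # multi-modal input
ehr_only = fusion([ehr])     # uni-modal input (X-ray missing)
```

Because the fusion is sequential rather than a fixed-width concatenation, no imputation or zero-filling of the missing image embedding is needed, which is one plausible reason such a module stays robust on partially paired test sets.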
