Paper Title
Pretraining ECG Data with Adversarial Masking Improves Model Generalizability for Data-Scarce Tasks
Authors
Abstract
Medical datasets often face data scarcity, because ground-truth labels must be generated by medical professionals. One mitigation strategy is to pretrain deep learning models on large, unlabelled datasets with self-supervised learning (SSL). Data augmentations are essential for improving the generalizability of SSL-trained models, but they are typically handcrafted and tuned manually. We use an adversarial model to generate masks as augmentations for 12-lead electrocardiogram (ECG) data, where the masks learn to occlude diagnostically relevant regions of the ECGs. Compared to random augmentations, adversarial masking reaches better accuracy when transferring to two diverse downstream objectives: arrhythmia classification and gender classification. Compared to the state-of-the-art ECG augmentation method 3KG, adversarial masking performs better in data-scarce regimes, demonstrating the generalizability of our model.
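To make the idea of masking as an augmentation concrete, the following is a minimal sketch of the random-masking baseline only, not the authors' implementation. It assumes a 12-lead ECG stored as a list of per-lead sample lists, a single binary mask shared across all leads, and masked samples set to zero; the helper names `random_mask` and `apply_mask`, the 20% mask fraction, and the synthetic Gaussian signal are all hypothetical choices made for illustration. In the adversarial setting described in the abstract, a learned model would produce the mask instead; that model is not sketched here.

```python
import random


def random_mask(n_samples, mask_fraction, rng):
    """Return a 0/1 mask hiding one contiguous window chosen at random.

    A value of 1 means "hidden". The window covers mask_fraction of the
    recording (hypothetical baseline, not the paper's mask generator).
    """
    width = int(n_samples * mask_fraction)
    start = rng.randrange(0, n_samples - width + 1)
    return [1 if start <= t < start + width else 0 for t in range(n_samples)]


def apply_mask(ecg, mask):
    """Zero out the masked time steps in every lead (assumed shared mask)."""
    return [[x * (1 - m) for x, m in zip(lead, mask)] for lead in ecg]


# Synthetic stand-in for a 12-lead ECG: 12 leads x 100 samples of noise.
rng = random.Random(0)
n_leads, n_samples = 12, 100
ecg = [[rng.gauss(0.0, 1.0) for _ in range(n_samples)] for _ in range(n_leads)]

mask = random_mask(n_samples, 0.2, rng)
masked_ecg = apply_mask(ecg, mask)
```

A pretraining pipeline would feed `masked_ecg` to the SSL encoder as one augmented view; how the adversarial mask generator is parameterized and trained is left to the paper itself.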