Paper title
Generative Adversarial Network Based Synthetic Learning and a Novel Domain Relevant Loss Term for Spine Radiographs
Paper authors
Paper abstract
Problem: Medicine lacks the big data needed to train deep learning models, owing to the time cost of data collection and to privacy concerns. Generative adversarial networks (GANs) offer the potential both to generate new data and to use that newly generated data, without including patients' real data, in downstream applications.
Approach: A series of GANs was trained and applied to a downstream computer vision task: abnormality classification of spine radiographs. Separate classifiers were trained either with or without access to the original imaging. The trained GANs included a conditional StyleGAN2 with adaptive discriminator augmentation; a conditional StyleGAN2 with adaptive discriminator augmentation that generates spine radiographs conditional on lesion type; and SpineGAN, a StyleGAN2 with adaptive discriminator augmentation, conditional on abnormality, that uses a novel clinical loss term for the generator. Finally, a StyleGAN2 with adaptive discriminator augmentation, conditional on abnormality and with differential privacy imposed, was trained, and an ablation study was performed on its differential privacy impositions.
Key Results: We accomplish, for the first time according to a literature review, GAN generation of synthetic spine radiographs without meaningful input. We further demonstrate the success of synthetic learning for the spine domain with a downstream clinical classification task (AUC of 0.830 using synthetic data, compared with 0.886 using the real data). Importantly, introducing a new clinical loss term for the generator was found to increase generation recall and to accelerate model training. Lastly, we demonstrate that, in a limited-size medical dataset, differential privacy impositions severely impede GAN training, and we find that this is specifically due to the requirement for gradient perturbation with noise.
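For readers unfamiliar with "gradient perturbation with noise", the following is a minimal, generic sketch of a DP-SGD-style gradient step, not the authors' implementation. The function name `dp_perturbed_gradient` and the parameters `clip_norm` and `noise_multiplier` are assumptions introduced here for illustration. Each per-sample gradient is clipped to a maximum L2 norm, the clipped gradients are summed, Gaussian noise scaled to the clipping bound is added, and the result is averaged over the batch.

```python
import numpy as np


def dp_perturbed_gradient(per_sample_grads, clip_norm, noise_multiplier, rng):
    """Return a noisy average gradient in the style of DP-SGD (illustrative only).

    per_sample_grads: array-like of shape (batch_size, n_params)
    clip_norm:        hypothetical maximum L2 norm allowed per sample
    noise_multiplier: hypothetical ratio of noise std to clip_norm
    rng:              a numpy.random.Generator
    """
    grads = np.asarray(per_sample_grads, dtype=float)
    # L2 norm of each sample's gradient, kept as a column for broadcasting.
    norms = np.linalg.norm(grads, axis=1, keepdims=True)
    # Scale down only those gradients whose norm exceeds clip_norm.
    scale = np.minimum(1.0, clip_norm / np.maximum(norms, 1e-12))
    clipped = grads * scale
    # Gaussian noise with std proportional to the clipping bound.
    noise = rng.normal(0.0, noise_multiplier * clip_norm, size=grads.shape[1])
    # Noisy sum, averaged over the batch.
    return (clipped.sum(axis=0) + noise) / grads.shape[0]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    grads = rng.normal(size=(8, 4))  # toy per-sample gradients
    print(dp_perturbed_gradient(grads, clip_norm=1.0, noise_multiplier=1.1, rng=rng))
```

In this sketch the noise standard deviation on the averaged gradient is `noise_multiplier * clip_norm / batch_size`, so the noise does not shrink with the size of the true gradient. This is a general property of the technique and is offered only as background for the abstract's final finding.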