Paper Title
SD-LayerNet: Semi-supervised retinal layer segmentation in OCT using disentangled representation with anatomical priors
Paper Authors
Abstract
Optical coherence tomography (OCT) is a non-invasive 3D modality widely used in ophthalmology for imaging the retina. Achieving automated, anatomically coherent retinal layer segmentation on OCT is important for the detection and monitoring of different retinal diseases, such as Age-related Macular Degeneration (AMD) or Diabetic Retinopathy. However, the majority of state-of-the-art layer segmentation methods are based on purely supervised deep learning, requiring a large amount of pixel-level annotated data that is expensive and hard to obtain. With this in mind, we introduce a semi-supervised paradigm into the retinal layer segmentation task that makes use of the information present in large-scale unlabeled datasets as well as anatomical priors. In particular, a novel, fully differentiable approach is used for converting surface position regression into a pixel-wise structured segmentation, allowing both 1D surface and 2D layer representations to be used in a coupled fashion to train the model. These 2D segmentations are then used as anatomical factors that, together with learned style factors, compose disentangled representations for reconstructing the input image. In parallel, we propose a set of anatomical priors that improve network training when only a limited amount of labeled data is available. We demonstrate on a real-world dataset of scans with intermediate and wet AMD that our method outperforms the state of the art when trained on our full training set and, more importantly, largely exceeds it when trained with only a fraction of the labeled data.
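The core idea of converting 1D surface positions into 2D layer masks differentiably can be illustrated with a minimal sketch. The function below is an assumption about one common way to realize such a conversion (column-wise sigmoids acting as soft indicators), not the paper's exact formulation: for K regressed surfaces per image column, it produces K+1 soft layer masks that sum to one at every pixel and remain differentiable with respect to the surface positions.

```python
import numpy as np

def surfaces_to_layer_masks(surface_pos, height, temperature=1.0):
    """Illustrative soft conversion of surface regressions to layer masks.

    surface_pos: array of shape (K, W) -- row position of each of K
                 surfaces (ordered top to bottom) in each of W columns.
    Returns an array of shape (H, K+1, W) of soft layer masks.
    NOTE: hypothetical sketch, not the method from the paper.
    """
    rows = np.arange(height)[:, None, None]            # (H, 1, 1)
    s = np.asarray(surface_pos, dtype=float)[None]     # (1, K, W)
    # Soft indicator: ~1 for rows below surface k, ~0 above it.
    below = 1.0 / (1.0 + np.exp(-(rows - s) / temperature))  # (H, K, W)
    ones = np.ones((height, 1, below.shape[2]))
    zeros = np.zeros((height, 1, below.shape[2]))
    # Layer k lies between surface k and surface k+1:
    # mask_0 = 1 - below_1, mask_k = below_k - below_{k+1}, mask_K = below_K.
    masks = np.concatenate([ones, below], axis=1) \
          - np.concatenate([below, zeros], axis=1)     # (H, K+1, W)
    return masks
```

Because every operation is smooth, gradients from a pixel-wise segmentation loss can flow back to the regressed surface positions, which is what allows the 1D and 2D representations to be trained in a coupled fashion; the per-pixel masks always sum to one by construction.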