Paper title
Oral cancer detection and interpretation: Deep multiple instance learning versus conventional deep single instance learning
Paper authors
Paper abstract
The current medical standard for establishing an oral cancer (OC) diagnosis is histological examination of a tissue sample from the oral cavity. This process is time-consuming and more invasive than the alternative of acquiring a brush sample followed by cytological analysis. Skilled cytotechnologists are able to detect changes due to malignancy; however, introducing this approach into clinical routine faces challenges such as a lack of experts and labour-intensive work. To design a trustworthy OC detection system that would assist cytotechnologists, we are interested in AI-based methods that can reliably detect cancer given only per-patient labels (minimizing annotation bias) and that also provide information on which cells are most relevant for the diagnosis (enabling supervision and understanding). We therefore compare a conventional single instance learning (SIL) approach with a modern multiple instance learning (MIL) method suitable for OC detection and interpretation, using three different neural network architectures. To facilitate systematic evaluation of the considered approaches, we introduce a synthetic PAP-QMNIST dataset that serves as a model of OC data while offering access to per-instance ground truth. Our study indicates that on PAP-QMNIST, the SIL approach performs better, on average, than the MIL approach. Performance at the bag level on real-world cytological data is similar for both methods, yet the single instance approach performs better on average. Visual examination by a cytotechnologist indicates that the methods manage to identify cells which deviate from normality, including malignant cells as well as those suspicious for dysplasia. We share the code as open source at https://github.com/MIDA-group/OralCancerMILvsSIL
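The difference between the two learning setups named in the abstract can be illustrated with a minimal sketch. It is not taken from the shared repository, and all function names are hypothetical. The abstract does not specify how SIL instance scores are combined per patient or which MIL method is used, so the mean aggregation and the attention-style pooling below are assumptions made only for illustration.

```python
import math
from typing import List, Sequence, Tuple

# One patient is a "bag" of per-cell feature vectors; only the bag has a label.
Bag = List[List[float]]


def sil_training_pairs(bags: Sequence[Bag], labels: Sequence[int]) -> List[Tuple[List[float], int]]:
    """SIL: every cell inherits the label of the patient it came from."""
    return [(cell, label) for bag, label in zip(bags, labels) for cell in bag]


def sil_bag_score(instance_scores: Sequence[float]) -> float:
    """Combine per-cell scores into one patient score (mean is an assumption)."""
    return sum(instance_scores) / len(instance_scores)


def softmax(values: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of logits."""
    shift = max(values)
    exps = [math.exp(v - shift) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def mil_attention_pool(bag: Bag, attention_logits: Sequence[float]) -> Tuple[List[float], List[float]]:
    """MIL (assumed attention-style pooling): weight the cells of a bag and
    sum them into a single bag embedding, which a classifier would then label."""
    weights = softmax(attention_logits)
    dim = len(bag[0])
    pooled = [sum(w * cell[d] for w, cell in zip(weights, bag)) for d in range(dim)]
    return pooled, weights


if __name__ == "__main__":
    bags = [[[0.1, 0.2], [0.9, 0.8]], [[0.0, 0.1]]]
    labels = [1, 0]
    print(sil_training_pairs(bags, labels))
    print(mil_attention_pool(bags[0], [0.0, 0.0]))
```

In this sketch, per-cell relevance would come from the instance scores in the SIL case and from the attention weights in the MIL case, which is one way to obtain the cell-level information the abstract asks for.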