On Interpretability of Deep Learning based Skin Lesion Classifiers using Concept Activation Vectors

Paper Title

On Interpretability of Deep Learning based Skin Lesion Classifiers using Concept Activation Vectors

Authors

Lucieri, Adriano; Bajwa, Muhammad Naseer; Braun, Stephan Alexander; Malik, Muhammad Imran; Dengel, Andreas; Ahmed, Sheraz

Abstract

Deep learning based medical image classifiers have shown remarkable prowess in various application areas like ophthalmology, dermatology, pathology, and radiology. However, the acceptance of these Computer-Aided Diagnosis (CAD) systems in real clinical setups is severely limited, primarily because their decision-making process remains largely obscure. This work aims at elucidating a deep learning based medical image classifier by verifying that the model learns and utilizes disease-related concepts similar to those described and employed by dermatologists. We used a well-trained and high-performing neural network developed by the REasoning for COmplex Data (RECOD) Lab for classification of three skin tumours, i.e., Melanocytic Naevi, Melanoma, and Seborrheic Keratosis, and performed a detailed analysis of its latent space. Two well-established and publicly available skin disease datasets, PH2 and derm7pt, are used for experimentation. Human-understandable concepts are mapped to the RECOD image classification model with the help of Concept Activation Vectors (CAVs), introducing a novel training and significance testing paradigm for CAVs. Our results on an independent evaluation set clearly show that the classifier learns and encodes human-understandable concepts in its latent representation. Additionally, TCAV (Testing with CAVs) scores suggest that the neural network indeed makes use of disease-related concepts in the correct way when making predictions. We anticipate that this work can not only increase the confidence of medical practitioners in CAD but also serve as a stepping stone for further development of CAV-based neural network interpretation methods.
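
To make the two central terms of the abstract concrete, the following is a minimal sketch of how a CAV and a TCAV score are commonly computed. It is not the training and significance testing paradigm introduced in the paper, which the abstract does not detail. The function names `train_cav` and `tcav_score`, the hand-written logistic regression, and the synthetic activations and gradients are all assumptions made for illustration; in practice the activations and gradients would be taken from a layer of the classifier under study.

```python
import numpy as np

def train_cav(concept_acts, random_acts, lr=0.1, steps=500):
    """Fit a linear (logistic) separator between concept and random
    activations and return its unit-norm normal vector, i.e. the CAV."""
    X = np.vstack([concept_acts, random_acts])
    y = np.concatenate([np.ones(len(concept_acts)), np.zeros(len(random_acts))])
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))   # predicted P(concept)
        w -= lr * (X.T @ (p - y)) / len(y)       # gradient step on weights
        b -= lr * np.mean(p - y)                 # gradient step on bias
    return w / np.linalg.norm(w)

def tcav_score(class_logit_grads, cav):
    """Fraction of inputs whose class logit increases when the layer
    activation is moved in the direction of the CAV."""
    directional_derivatives = class_logit_grads @ cav
    return float(np.mean(directional_derivatives > 0))

# Synthetic stand-ins (assumption): 8-dimensional layer activations in which
# the concept shifts activations along the first axis.
rng = np.random.default_rng(0)
dim = 8
concept_direction = np.zeros(dim)
concept_direction[0] = 1.0

concept_acts = rng.normal(size=(50, dim)) + 4.0 * concept_direction
random_acts = rng.normal(size=(50, dim))
cav = train_cav(concept_acts, random_acts)

# Synthetic gradients of one class logit w.r.t. the layer activations
# (assumption): the class is sensitive to the concept direction.
grads = concept_direction + 0.1 * rng.normal(size=(40, dim))
score = tcav_score(grads, cav)
print("TCAV score:", score)
```

A score near 1 means the class logit increases along the concept direction for almost every input, while a score near 0.5 suggests no consistent sensitivity. Repeating the procedure with several random counterexample sets and testing the resulting scores statistically is the usual way to guard against spurious CAVs.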
