A Question-Centric Model for Visual Question Answering in Medical Imaging

Paper title

A Question-Centric Model for Visual Question Answering in Medical Imaging

Authors

Minh H. Vu, Tommy Löfstedt, Tufve Nyholm, Raphael Sznitman

Abstract

Deep learning methods have proven extremely effective at performing a variety of medical image analysis tasks. With their potential use in clinical routine, their lack of transparency has however been one of their few weak points, raising concerns regarding their behavior and failure modes. While most research to infer model behavior has focused on indirect strategies that estimate prediction uncertainties and visualize model support in the input image space, the ability to explicitly query a prediction model regarding its image content offers a more direct way to determine the behavior of trained models. To this end, we present a novel Visual Question Answering approach that allows an image to be queried by means of a written question. Experiments on a variety of medical and natural image datasets show that by fusing image and question features in a novel way, the proposed approach achieves an equal or higher accuracy compared to current methods.
