Paper Title

Improving the Interpretability of fMRI Decoding using Deep Neural Networks and Adversarial Robustness

Paper Authors

Patrick McClure, Dustin Moraczewski, Ka Chun Lam, Adam Thomas, Francisco Pereira

Paper Abstract

Deep neural networks (DNNs) are being increasingly used to make predictions from functional magnetic resonance imaging (fMRI) data. However, they are widely seen as uninterpretable "black boxes", as it can be difficult to discover what input information is used by the DNN in the process, which is important in both cognitive neuroscience and clinical applications. A saliency map is a common approach for producing interpretable visualizations of the relative importance of input features for a prediction. However, methods for creating maps often fail, either because DNNs are sensitive to input noise or because the methods focus too much on the input and too little on the model. It is also challenging to evaluate how well saliency maps correspond to the truly relevant input information, as ground truth is not always available. In this paper, we review a variety of methods for producing gradient-based saliency maps, and present a new adversarial training method we developed to make DNNs robust to input noise, with the goal of improving interpretability. We introduce two quantitative evaluation procedures for saliency map methods in fMRI, applicable whenever a DNN or linear model is being trained to decode some information from imaging data. We evaluate the procedures using a synthetic dataset where the complex activation structure is known, and on saliency maps produced for DNN and linear models for task decoding in the Human Connectome Project (HCP) dataset. Our key finding is that saliency maps produced with different methods vary widely in interpretability, in both synthetic and HCP fMRI data. Strikingly, even when DNN and linear models decode at comparable levels of performance, DNN saliency maps score higher on interpretability than linear model saliency maps (derived via weights or gradient). Finally, saliency maps produced with our adversarial training method outperform those from other methods.
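To make the two ingredients of the abstract concrete, the sketch below computes a vanilla gradient saliency map for a decoder and performs one FGSM-style adversarial perturbation step, a standard stand-in for adversarial training against input noise. This is not the authors' implementation (the paper proposes its own adversarial training method, which is not reproduced here); the toy decoder, voxel count, class count, and `epsilon` are illustrative assumptions.

```python
# Minimal sketch, NOT the paper's method: vanilla gradient saliency for an
# fMRI decoder, plus one FGSM-style adversarial perturbation step as a
# generic example of training for robustness to input noise.
import torch
import torch.nn as nn

N_VOXELS, N_CLASSES = 10000, 7  # hypothetical masked-voxel count and task classes

decoder = nn.Sequential(        # toy stand-in for a trained fMRI decoder
    nn.Linear(N_VOXELS, 256), nn.ReLU(),
    nn.Linear(256, N_CLASSES),
)

def saliency_map(model, x, target):
    """|d logit_target / d input|: one importance score per voxel."""
    x = x.clone().detach().requires_grad_(True)
    logit = model(x)[0, target]
    grad_x, = torch.autograd.grad(logit, x)
    return grad_x.abs().squeeze(0)

def adversarial_loss(model, loss_fn, x, y, epsilon=0.01):
    """FGSM-style step: perturb inputs along the sign of the input gradient,
    then return the loss on the perturbed batch for a robustness-oriented update."""
    x = x.clone().detach().requires_grad_(True)
    grad_x, = torch.autograd.grad(loss_fn(model(x), y), x)
    x_adv = (x + epsilon * grad_x.sign()).detach()
    return loss_fn(model(x_adv), y)

# Usage on a random stand-in for a preprocessed fMRI feature vector.
x, y = torch.randn(1, N_VOXELS), torch.tensor([3])
print(saliency_map(decoder, x, target=3).shape)               # torch.Size([10000])
print(adversarial_loss(decoder, nn.CrossEntropyLoss(), x, y))  # scalar loss tensor
```

In practice the adversarial loss would be backpropagated through the model parameters at each training step; the saliency map would then be computed on the robust model and projected back into the brain mask for visualization.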
