Paper Title

Input Bias in Rectified Gradients and Modified Saliency Maps

Paper Authors

Lennart Brocki, Neo Christopher Chung

Paper Abstract

Interpretation and improvement of deep neural networks relies on a better understanding of their underlying mechanisms. In particular, gradients of classes or concepts with respect to input features (e.g., pixels in images) are often used as importance scores or estimators, which are visualized in saliency maps. Thus, a family of saliency methods provides an intuitive way to identify input features with substantial influence on classifications or latent concepts. Several modifications to conventional saliency maps, such as Rectified Gradients and Layer-wise Relevance Propagation (LRP), have been introduced to allegedly denoise and improve interpretability. While visually coherent in certain cases, Rectified Gradients and other modified saliency maps introduce a strong input bias (e.g., brightness in the RGB space) because of inappropriate uses of the input features. We demonstrate that dark areas of an input image are not highlighted by a saliency map using Rectified Gradients, even if they are relevant to the class or concept. Even in scaled images, the input bias persists around an artificial point in the color spectrum. Our modification, which simply eliminates the multiplication with input features, removes this bias. This showcases how visual criteria may not align with the true explainability of deep learning models.
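
To make the source of the bias concrete, the following is a minimal sketch in PyTorch, assuming an image classifier with a [1, C, H, W] input tensor; the function names are illustrative and are not the paper's code. Rectified Gradients involves such multiplications at the layer level, but an input-level gradient-times-input product already illustrates the effect: pixels with values near zero (dark in RGB) receive near-zero importance by construction, regardless of their relevance.

import torch

def vanilla_gradient(model, x, target_class):
    # Gradient of the class score with respect to the input pixels:
    # the conventional saliency map, free of the input bias.
    x = x.clone().detach().requires_grad_(True)
    score = model(x)[0, target_class]
    score.backward()
    return x.grad.detach()

def gradient_times_input(model, x, target_class):
    # Multiplying the gradient by the input couples the attribution to
    # pixel intensity; dark (near-zero) pixels are suppressed no matter
    # how relevant they are to the class.
    return vanilla_gradient(model, x, target_class) * x.detach()

In this sketch, eliminating the multiplication with the input features means reporting vanilla_gradient alone, which is the spirit of the modification described in the abstract.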
