Explaining Neural Networks by Decoding Layer Activations

Paper title

Explaining Neural Networks by Decoding Layer Activations

Authors

Johannes Schneider, Michalis Vlachos

Abstract

We present a 'CLAssifier-DECoder' architecture (ClaDec) that facilitates comprehension of the output of an arbitrary layer in a neural network (NN). It uses a decoder to transform the non-interpretable representation of the given layer into a representation that is more similar to the domain a human is familiar with. In an image recognition problem, one can recognize what information a layer represents by contrasting the reconstructed images of ClaDec with those of a conventional auto-encoder (AE) serving as a reference. We also extend ClaDec to allow a trade-off between human interpretability and fidelity. We evaluate our approach for image classification using convolutional NNs. We show that reconstructed visualizations using encodings from a classifier capture more information relevant for classification than those of conventional AEs. Relevant code is available at https://github.com/JohnTailor/ClaDec
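
The core idea in the abstract, decoding the activations of a fixed classifier layer back into input space, can be illustrated with a toy sketch. This is not the authors' implementation: the random "layer" weights, the synthetic data, and the linear least-squares decoder are all assumptions chosen to keep the example small, whereas the paper uses trained convolutional networks and a learned decoder compared against a reference auto-encoder.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in data: 200 "images" flattened to 16 features each.
X = rng.normal(size=(200, 16))

# Stand-in for one layer of a trained classifier (assumption: random weights).
# In the ClaDec setting these weights come from the classifier and stay frozen.
W_layer = rng.normal(size=(16, 8))

def layer_activations(x):
    """Return the (ReLU) activations of the frozen layer for inputs x."""
    return np.maximum(x @ W_layer, 0.0)

def add_bias(a):
    """Append a constant column so the linear decoder has an intercept."""
    return np.hstack([a, np.ones((a.shape[0], 1))])

# Decoder (assumption: a linear map fitted by least squares) that maps
# layer activations back to the input domain. Only the decoder is fitted.
A = layer_activations(X)
W_dec, *_ = np.linalg.lstsq(add_bias(A), X, rcond=None)

def decode(a):
    """Reconstruct inputs from layer activations."""
    return add_bias(a) @ W_dec

X_hat = decode(A)

# Reconstruction error of the decoder versus predicting the mean input.
recon_mse = float(np.mean((X - X_hat) ** 2))
baseline_mse = float(np.mean((X - X.mean(axis=0)) ** 2))
print(f"decoder MSE: {recon_mse:.3f}, mean-baseline MSE: {baseline_mse:.3f}")
```

Whatever the reconstruction `X_hat` fails to recover is information the layer did not retain, at least as far as this simple decoder can tell. In the paper's setting the comparison point is a conventional auto-encoder rather than the mean baseline used here.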
