Global Concept-Based Interpretability for Graph Neural Networks via Neuron Analysis

Paper Title

Global Concept-Based Interpretability for Graph Neural Networks via Neuron Analysis

Authors

Xuanyuan, Han; Barbiero, Pietro; Georgiev, Dobrik; Magister, Lucie Charlotte; Lió, Pietro

Abstract

Graph neural networks (GNNs) are highly effective on a variety of graph-related tasks; however, they lack interpretability and transparency. Current explainability approaches are typically local and treat GNNs as black-boxes. They do not look inside the model, inhibiting human trust in the model and explanations. Motivated by the ability of neurons to detect high-level semantic concepts in vision models, we perform a novel analysis on the behaviour of individual GNN neurons to answer questions about GNN interpretability, and propose new metrics for evaluating the interpretability of GNN neurons. We propose a novel approach for producing global explanations for GNNs using neuron-level concepts to enable practitioners to have a high-level view of the model. Specifically, (i) to the best of our knowledge, this is the first work which shows that GNN neurons act as concept detectors and have strong alignment with concepts formulated as logical compositions of node degree and neighbourhood properties; (ii) we quantitatively assess the importance of detected concepts, and identify a trade-off between training duration and neuron-level interpretability; (iii) we demonstrate that our global explainability approach has advantages over the current state-of-the-art -- we can disentangle the explanation into individual interpretable concepts backed by logical descriptions, which reduces potential for bias and improves user-friendliness.
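
The abstract describes neurons acting as concept detectors for "logical compositions of node degree and neighbourhood properties" but does not say how alignment is scored. The sketch below is only an illustration under stated assumptions: it thresholds one neuron's per-node activations into a set of nodes and compares that set with the nodes satisfying a concept, using intersection over union (a score borrowed from network-dissection-style analyses in vision, assumed here rather than taken from this page). The toy graph, the activation values, the threshold, and all function names are hypothetical.

```python
# Toy undirected graph as an adjacency list (hypothetical data).
GRAPH = {
    0: [1, 2, 3],
    1: [0],
    2: [0, 3],
    3: [0, 2, 4],
    4: [3],
}

# Hypothetical activations of a single neuron, one value per node.
ACTIVATIONS = {0: 0.9, 1: 0.1, 2: 0.2, 3: 0.8, 4: 0.0}


def degree(graph, node):
    """Number of neighbours of a node."""
    return len(graph[node])


def high_degree_with_leaf(graph, node):
    """Concept: degree >= 3 AND at least one neighbour has degree 1."""
    has_leaf_neighbour = any(degree(graph, n) == 1 for n in graph[node])
    return degree(graph, node) >= 3 and has_leaf_neighbour


def degree_at_least_two(graph, node):
    """A simpler concept, used for comparison: degree >= 2."""
    return degree(graph, node) >= 2


def concept_mask(graph, concept):
    """Set of nodes for which the concept holds."""
    return {node for node in graph if concept(graph, node)}


def activation_mask(activations, threshold):
    """Set of nodes where the neuron fires above the threshold."""
    return {node for node, value in activations.items() if value > threshold}


def iou(a, b):
    """Intersection over union of two node sets (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


if __name__ == "__main__":
    fired = activation_mask(ACTIVATIONS, threshold=0.5)
    for concept in (high_degree_with_leaf, degree_at_least_two):
        score = iou(fired, concept_mask(GRAPH, concept))
        print(f"{concept.__name__}: IoU = {score:.2f}")
```

In this toy setup the neuron fires on exactly the nodes matching the composed concept, so that concept scores higher than the plain degree concept. A real analysis would use activations from a trained GNN and search over many candidate concepts.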
