Mutual information estimation for graph convolutional neural networks

Paper title

Mutual information estimation for graph convolutional neural networks

Authors

Landverk, Marius C., Riemer-Sørensen, Signe

Abstract

Measuring model performance is a key issue for deep learning practitioners. However, we often lack the ability to explain why a specific architecture attains superior predictive accuracy for a given data set. Often, validation accuracy is used as a performance heuristic quantifying how well a network generalizes to unseen data, but it does not capture anything about the information flow in the model. Mutual information can be used as a measure of the quality of internal representations in deep learning models, and the information plane may provide insights into whether the model exploits the available information in the data. The information plane has previously been explored for fully connected neural networks and convolutional architectures. We present an architecture-agnostic method for tracking a network's internal representations during training, which are then used to create the mutual information plane. The method is exemplified for graph-based neural networks fitted on citation data. We compare how the inductive bias introduced in graph-based architectures changes the mutual information plane relative to a fully connected neural network.
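
To make the idea of an information plane point concrete, the following is a minimal sketch of a binning-based mutual information estimate between a layer's activations and discrete labels. The abstract does not state which estimator the authors use, so the binning approach, the function names (`discrete_entropy`, `binned_ids`, `mutual_information`), and the `n_bins` parameter are all assumptions made for illustration, not the paper's method.

```python
import numpy as np


def discrete_entropy(ids):
    """Shannon entropy (in bits) of a 1-D array of discrete ids."""
    _, counts = np.unique(ids, return_counts=True)
    p = counts / counts.sum()
    return float(-np.sum(p * np.log2(p)))


def binned_ids(activations, n_bins=30):
    """Map each row of a (samples, units) activation matrix to one integer id.

    Assumption: equal-width bins over the global activation range.
    Rows that fall in the same bins for every unit share an id.
    """
    activations = np.asarray(activations, dtype=float)
    edges = np.linspace(activations.min(), activations.max(), n_bins + 1)
    digitized = np.digitize(activations, edges[1:-1])
    _, ids = np.unique(digitized, axis=0, return_inverse=True)
    return ids.ravel()


def mutual_information(a_ids, b_ids):
    """I(A;B) = H(A) + H(B) - H(A,B) for two aligned 1-D id arrays."""
    joint = np.stack([np.asarray(a_ids), np.asarray(b_ids)], axis=1)
    _, joint_ids = np.unique(joint, axis=0, return_inverse=True)
    return (
        discrete_entropy(a_ids)
        + discrete_entropy(b_ids)
        - discrete_entropy(joint_ids.ravel())
    )


# Toy example: one hidden unit that separates two classes perfectly.
labels = np.array([0, 0, 1, 1])
hidden = np.array([[0.0], [0.1], [0.9], [1.0]])
i_ty = mutual_information(binned_ids(hidden, n_bins=2), labels)
```

Under these assumptions, recording the activations of each layer after every epoch and computing one such estimate against the inputs and one against the labels would give the two coordinates of that layer's trajectory in the information plane. Binned estimates are sensitive to the choice of `n_bins`, so the numbers are best read as relative rather than absolute.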
