Paper Title

TREX: Tree-Ensemble Representer-Point Explanations

Paper Authors

Jonathan Brophy, Daniel Lowd

Abstract

How can we identify the training examples that contribute most to the prediction of a tree ensemble? In this paper, we introduce TREX, an explanation system that provides instance-attribution explanations for tree ensembles, such as random forests and gradient boosted trees. TREX builds on the representer point framework previously developed for explaining deep neural networks. Since tree ensembles are non-differentiable, we define a kernel that captures the structure of the specific tree ensemble. By using this kernel in kernel logistic regression or a support vector machine, TREX builds a surrogate model that approximates the original tree ensemble. The weights in the kernel expansion of the surrogate model are used to define the global or local importance of each training example. Our experiments show that TREX's surrogate model accurately approximates the tree ensemble; its global importance weights are more effective in dataset debugging than the previous state-of-the-art; its explanations identify the most influential samples better than alternative methods under the remove and retrain evaluation framework; it runs orders of magnitude faster than alternative methods; and its local explanations can identify and explain errors due to domain mismatch.
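The surrogate-model idea above can be sketched in a few lines. Below is a minimal illustration, not the paper's implementation: it uses a simple leaf-coincidence kernel (two examples are similar in proportion to how many trees route them to the same leaf) as one plausible tree-ensemble kernel, fits an SVM with that precomputed kernel as the surrogate, and reads each training example's global importance off the magnitude of its dual weight. The dataset and all names here are illustrative.

```python
# Hedged sketch: a tree-ensemble kernel + kernel-SVM surrogate whose dual
# weights serve as training-example importances. This is an illustration of
# the general approach, not TREX's exact kernel or training procedure.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import SVC

# Toy binary classification data and a tree ensemble to explain.
X, y = make_classification(n_samples=200, random_state=0)
rf = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, y)

# Leaf indices for every (example, tree) pair: shape (n_samples, n_trees).
leaves = rf.apply(X)

def tree_kernel(A, B):
    # K[i, j] = fraction of trees in which A[i] and B[j] land in the same
    # leaf; this captures the structure of the fitted ensemble.
    return (A[:, None, :] == B[None, :, :]).mean(axis=2)

K = tree_kernel(leaves, leaves)

# Surrogate model: an SVM trained on the precomputed tree kernel.
svm = SVC(kernel="precomputed").fit(K, y)

# Global importance of each training example = |dual weight| in the kernel
# expansion (examples that are not support vectors get weight 0).
alpha = np.zeros(len(X))
alpha[svm.support_] = svm.dual_coef_[0]
importance = np.abs(alpha)
top5 = np.argsort(importance)[::-1][:5]  # most influential training points
```

A kernel logistic regression would assign a nonzero weight to every training example rather than only the support vectors, which is why the paper considers both surrogate families.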
