Paper title

Threading the Needle of On and Off-Manifold Value Functions for Shapley Explanations

Authors

Chih-Kuan Yeh, Kuan-Yun Lee, Frederick Liu, Pradeep Ravikumar

Abstract

A popular explainable AI (XAI) approach to quantify feature importance of a given model is via Shapley values. These Shapley values arose in cooperative games, and hence a critical ingredient to compute these in an XAI context is a so-called value function, that computes the "value" of a subset of features, and which connects machine learning models to cooperative games. There are many possible choices for such value functions, which broadly fall into two categories: on-manifold and off-manifold value functions, which take an observational and an interventional viewpoint respectively. Both these classes however have their respective flaws, where on-manifold value functions violate key axiomatic properties and are computationally expensive, while off-manifold value functions pay less heed to the data manifold and evaluate the model on regions for which it wasn't trained. Thus, there is no consensus on which class of value functions to use. In this paper, we show that in addition to these existing issues, both classes of value functions are prone to adversarial manipulations on low density regions. We formalize the desiderata of value functions that respect both the model and the data manifold in a set of axioms and are robust to perturbation on off-manifold regions, and show that there exists a unique value function that satisfies these axioms, which we term the Joint Baseline value function, and the resulting Shapley value the Joint Baseline Shapley (JBshap), and validate the effectiveness of JBshap in experiments.
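
To make the role of the value function concrete, here is a minimal sketch of exact Shapley values computed under a plain baseline (off-manifold) value function, which keeps the explained input on the chosen feature subset and substitutes a fixed baseline elsewhere. This is not the paper's Joint Baseline value function, whose definition the abstract does not give. The toy `model`, the all-zero `baseline`, and the helper names `baseline_value` and `shapley_values` are assumptions introduced purely for illustration.

```python
from itertools import combinations
from math import factorial


def baseline_value(model, x, baseline, subset):
    """Off-manifold (baseline) value function: evaluate the model on an
    input that keeps x on `subset` and uses the baseline everywhere else."""
    z = [x[i] if i in subset else baseline[i] for i in range(len(x))]
    return model(z)


def shapley_values(value_fn, n):
    """Exact Shapley values for a value function over n features.
    Enumerates all subsets, so it is only practical for small n."""
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for k in range(n):
            # Shapley weight for coalitions of size k that exclude feature i.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for coalition in combinations(others, k):
                s = frozenset(coalition)
                phi[i] += weight * (value_fn(s | {i}) - value_fn(s))
    return phi


# Hypothetical toy model: one additive term and one interaction term.
def model(z):
    return 2.0 * z[0] + z[1] * z[2]


x = [1.0, 2.0, 3.0]          # input being explained (assumed)
baseline = [0.0, 0.0, 0.0]   # reference point (assumed)

phi = shapley_values(lambda s: baseline_value(model, x, baseline, s), len(x))
print(phi)
```

In this toy case the attributions sum to `model(x) - model(baseline)`, and the interaction term is split equally between the second and third features. Note that the value function here evaluates the model on hybrid inputs that may lie far from the training data, which is the off-manifold concern the abstract describes.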
