On Counterfactual Explanations under Predictive Multiplicity

Paper title

On Counterfactual Explanations under Predictive Multiplicity

Authors

Martin Pawelczyk, Klaus Broelemann, Gjergji Kasneci

Abstract

Counterfactual explanations are usually obtained by identifying the smallest change made to an input to change a prediction made by a fixed model (hereafter called sparse methods). Recent work, however, has revitalized an old insight: there often does not exist one superior solution to a prediction problem with respect to commonly used measures of interest (e.g. error rate). In fact, often multiple different classifiers give almost equal solutions. This phenomenon is known as predictive multiplicity (Breiman, 2001; Marx et al., 2019). In this work, we derive a general upper bound for the costs of counterfactual explanations under predictive multiplicity. Most notably, it depends on a discrepancy notion between two classifiers, which describes how differently they treat negatively predicted individuals. We then compare sparse and data support approaches empirically on real-world data. The results show that data support methods are more robust to multiplicity of different models. At the same time, we show that those methods have provably higher cost of generating counterfactual explanations under one fixed model. In summary, our theoretical and empirical results challenge the commonly held view that counterfactual recommendations should be sparse in general.

Paper title