Paper Title
Model-Agnostic Explanations using Minimal Forcing Subsets
Paper Authors
Abstract
How can we find a subset of training samples that are most responsible for a specific prediction made by a complex black-box machine learning model? More generally, how can we explain the model's decisions to end users in a transparent way? We propose a new model-agnostic algorithm to identify a minimal set of training samples that are indispensable for a given model's decision at a particular test point, i.e., the model's decision would change upon the removal of this subset from the training dataset. Our algorithm identifies such a set of "indispensable" samples iteratively by solving a constrained optimization problem. Further, we speed up the algorithm through efficient approximations and provide theoretical justification for its performance. To demonstrate the applicability and effectiveness of our approach, we apply it to a variety of tasks, including data-poisoning detection, training-set debugging, and understanding loan decisions. The results show that our algorithm is an effective and easy-to-comprehend tool that helps to better understand local model behavior, and therefore facilitates the adoption of machine learning in domains where such understanding is a requisite.
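To make the notion of a "minimal forcing subset" concrete, here is a toy sketch of the underlying idea: repeatedly retrain after removing the training point that moves the model closest to flipping its prediction at the test point, and stop once the prediction changes. This is a simplified greedy stand-in, not the paper's constrained-optimization formulation, and it uses a hypothetical nearest-class-mean classifier in place of a black-box model; all function names below are illustrative.

```python
import numpy as np

def predict(X, y, x_test):
    """Toy stand-in for the black-box model: nearest-class-mean classifier."""
    mu0 = X[y == 0].mean(axis=0)
    mu1 = X[y == 1].mean(axis=0)
    return int(np.linalg.norm(x_test - mu1) < np.linalg.norm(x_test - mu0))

def minimal_forcing_subset(X, y, x_test, max_size=None):
    """Greedy approximation: remove training points one at a time, each time
    picking the point whose removal shrinks the decision margin at x_test
    the most, until the prediction flips. Returns the removed indices, or
    None if no flip was found within max_size removals."""
    n = len(X)
    base = predict(X, y, x_test)
    keep = np.ones(n, dtype=bool)
    removed = []
    max_size = max_size if max_size is not None else n - 2
    while len(removed) < max_size:
        best_i, best_margin = None, None
        for i in np.where(keep)[0]:
            trial = keep.copy()
            trial[i] = False
            # Both classes must remain present to "retrain" the toy model.
            if len(set(y[trial])) < 2:
                continue
            mu0 = X[trial][y[trial] == 0].mean(axis=0)
            mu1 = X[trial][y[trial] == 1].mean(axis=0)
            # Signed margin of the current prediction; smaller = closer to a flip.
            d = np.linalg.norm(x_test - mu0) - np.linalg.norm(x_test - mu1)
            margin = d if base == 1 else -d
            if best_margin is None or margin < best_margin:
                best_i, best_margin = i, margin
        if best_i is None:
            return None
        keep[best_i] = False
        removed.append(best_i)
        if predict(X[keep], y[keep], x_test) != base:
            return removed  # forcing subset found: removal flipped the decision
    return None
```

For instance, if one outlier near the test point is what tips the toy model's decision, the greedy loop removes exactly that point and reports a forcing subset of size one. The paper's actual method replaces this exhaustive greedy scan with a constrained optimization solved iteratively, plus approximations that avoid full retraining.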