Inherent Inconsistencies of Feature Importance

Paper Title

Inherent Inconsistencies of Feature Importance

Authors

Harel, Nimrod; Obolski, Uri; Gilad-Bachrach, Ran

Abstract

The rapid advancement and widespread adoption of machine learning-driven technologies have underscored the practical and ethical need for creating interpretable artificial intelligence systems. Feature importance, a method that assigns scores to the contribution of individual features on prediction outcomes, seeks to bridge this gap as a tool for enhancing human comprehension of these systems. Feature importance serves as an explanation of predictions in diverse contexts, whether by providing a global interpretation of a phenomenon across the entire dataset or by offering a localized explanation for the outcome of a specific data point. Furthermore, feature importance is being used both for explaining models and for identifying plausible causal relations in the data, independently from the model. However, it is worth noting that these various contexts have traditionally been explored in isolation, with limited theoretical foundations. This paper presents an axiomatic framework designed to establish coherent relationships among the different contexts of feature importance scores. Notably, our work unveils a surprising conclusion: when we combine the proposed properties with those previously outlined in the literature, we demonstrate the existence of an inconsistency. This inconsistency highlights that certain essential properties of feature importance scores cannot coexist harmoniously within a single framework.
