Paper title
ASTERYX : A model-Agnostic SaT-basEd appRoach for sYmbolic and score-based eXplanations
Paper authors
Abstract
The ever-increasing complexity of machine learning techniques, which are used more and more in practice, gives rise to the need to explain the predictions and decisions of these models, which are often used as black boxes. Explainable AI approaches are either numerical and feature-based, aiming to quantify the contribution of each feature to a prediction, or symbolic, providing certain forms of symbolic explanations such as counterfactuals. This paper proposes a generic, model-agnostic approach named ASTERYX that generates both symbolic explanations and score-based ones. Our approach is declarative: it is based on encoding the model to be explained in an equivalent symbolic representation. This representation is then used to generate, in particular, two types of symbolic explanations: sufficient reasons and counterfactuals. We then associate scores reflecting the relevance of the explanations and of the features with respect to some properties. Our experimental results show the feasibility of the proposed approach and its effectiveness in providing symbolic and score-based explanations.
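To make the two explanation types named in the abstract concrete, the following is a minimal sketch on a toy Boolean classifier. It is not the ASTERYX implementation: the paper encodes the model symbolically and relies on SAT-based reasoning, whereas this sketch substitutes brute-force enumeration, which is only workable for a handful of binary features. The classifier and all function names (`classifier`, `sufficient_reasons`, `counterfactuals`, and the helpers) are hypothetical and chosen only for illustration.

```python
from itertools import combinations, product


def classifier(x):
    # Hypothetical toy model over three binary features: (x0 AND x1) OR x2.
    return int((x[0] and x[1]) or x[2])


def is_sufficient(f, x, subset):
    # True if fixing the features in `subset` to their values in x
    # forces the prediction f(x), whatever the other features take.
    free = [i for i in range(len(x)) if i not in subset]
    target = f(x)
    for values in product([0, 1], repeat=len(free)):
        y = list(x)
        for i, v in zip(free, values):
            y[i] = v
        if f(tuple(y)) != target:
            return False
    return True


def flips_prediction(f, x, subset):
    # True if flipping exactly the features in `subset` changes the prediction.
    y = tuple(1 - v if i in subset else v for i, v in enumerate(x))
    return f(y) != f(x)


def minimal_subsets(n, predicate):
    # Enumerate feature subsets by increasing size and keep only those
    # that satisfy `predicate` and contain no smaller satisfying subset.
    found = []
    for size in range(n + 1):
        for subset in combinations(range(n), size):
            if any(set(s) <= set(subset) for s in found):
                continue
            if predicate(subset):
                found.append(subset)
    return found


def sufficient_reasons(f, x):
    # Subset-minimal sets of features that are enough to fix the prediction.
    return minimal_subsets(len(x), lambda s: is_sufficient(f, x, s))


def counterfactuals(f, x):
    # Subset-minimal sets of features whose flipping changes the prediction.
    return minimal_subsets(len(x), lambda s: flips_prediction(f, x, s))


if __name__ == "__main__":
    instance = (1, 1, 0)
    print("prediction:", classifier(instance))
    print("sufficient reasons:", sufficient_reasons(classifier, instance))
    print("counterfactuals:", counterfactuals(classifier, instance))
```

For the instance `(1, 1, 0)`, the only sufficient reason is features 0 and 1 together, and flipping either of them alone is a counterfactual. Enumeration is exponential in the number of features, which is one motivation for a declarative, solver-based formulation like the one the abstract describes.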