Paper Title

Argumentative Explanations for Pattern-Based Text Classifiers

Authors

Piyawat Lertvittayakumjorn, Francesca Toni

Abstract


Recent works in Explainable AI mostly address the transparency issue of black-box models or create explanations for any kind of models (i.e., they are model-agnostic), while leaving explanations of interpretable models largely underexplored. In this paper, we fill this gap by focusing on explanations for a specific interpretable model, namely pattern-based logistic regression (PLR) for binary text classification. We do so because, albeit interpretable, PLR is challenging when it comes to explanations. In particular, we found that a standard way to extract explanations from this model does not consider relations among the features, making the explanations hardly plausible to humans. Hence, we propose AXPLR, a novel explanation method using (forms of) computational argumentation to generate explanations (for outputs computed by PLR) which unearth model agreements and disagreements among the features. Specifically, we use computational argumentation as follows: we see features (patterns) in PLR as arguments in a form of quantified bipolar argumentation frameworks (QBAFs) and extract attacks and supports between arguments based on specificity of the arguments; we understand logistic regression as a gradual semantics for these QBAFs, used to determine the arguments' dialectic strength; and we study standard properties of gradual semantics for QBAFs in the context of our argumentative re-interpretation of PLR, sanctioning its suitability for explanatory purposes. We then show how to extract intuitive explanations (for outputs computed by PLR) from the constructed QBAFs. Finally, we conduct an empirical evaluation and two experiments in the context of human-AI collaboration to demonstrate the advantages of our resulting AXPLR method.
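To make concrete the "standard way to extract explanations" that the abstract says AXPLR improves on, the following is a minimal sketch of pattern-based logistic regression for binary text classification with a flat, per-feature explanation. All pattern names, weights, and activations here are invented for illustration; this is not the authors' implementation.

```python
import math

# Hypothetical binary pattern features for one document (1 = pattern matched).
# The patterns and weights are made up; in PLR they would be learned.
patterns = ["good", "not_good", "excellent"]
weights = {"good": 1.2, "not_good": -2.0, "excellent": 1.5}
bias = -0.3
activations = {"good": 1, "not_good": 1, "excellent": 0}

# Pattern-based logistic regression: P(positive) = sigmoid(b + sum_i w_i * x_i)
z = bias + sum(weights[p] * activations[p] for p in patterns)
prob_positive = 1.0 / (1.0 + math.exp(-z))

# The standard "flat" explanation ranks features by their signed contribution
# w_i * x_i. Note that it treats "good" and "not_good" as independent evidence,
# even though "not_good" is a more specific pattern subsuming "good" -- exactly
# the kind of relation AXPLR captures as an attack in a QBAF.
contributions = {p: weights[p] * activations[p] for p in patterns}
```

In this example the flat explanation lists both a positive contribution from "good" and a negative one from the overlapping "not_good", which is the sort of implausible-to-humans output that motivates the argumentative re-interpretation.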
