Interpretable by Design: Learning Predictors by Composing Interpretable Queries

Paper title

Interpretable by Design: Learning Predictors by Composing Interpretable Queries

Authors

Aditya Chattopadhyay, Stewart Slocum, Benjamin D. Haeffele, Rene Vidal, Donald Geman

Abstract

There is a growing concern about typically opaque decision-making with high-performance machine learning algorithms. Providing an explanation of the reasoning process in domain-specific terms can be crucial for adoption in risk-sensitive domains such as healthcare. We argue that machine learning algorithms should be interpretable by design and that the language in which these interpretations are expressed should be domain- and task-dependent. Consequently, we base our model's prediction on a family of user-defined and task-specific binary functions of the data, each having a clear interpretation to the end-user. We then minimize the expected number of queries needed for accurate prediction on any given input. As the solution is generally intractable, following prior work, we choose the queries sequentially based on information gain. However, in contrast to previous work, we need not assume the queries are conditionally independent. Instead, we leverage a stochastic generative model (VAE) and an MCMC algorithm (Unadjusted Langevin) to select the most informative query about the input based on previous query-answers. This enables the online determination of a query chain of whatever depth is required to resolve prediction ambiguities. Finally, experiments on vision and NLP tasks demonstrate the efficacy of our approach and its superiority over post-hoc explanations.
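The sequential selection rule described in the abstract (pick the next query by information gain, conditioned on the answers so far, and stop once the prediction is unambiguous) can be illustrated with a minimal sketch. This is not the authors' implementation: the paper estimates the required posteriors with a VAE and Unadjusted Langevin sampling, whereas the sketch below swaps in plain empirical counts over a tiny hand-made table. The names `entropy`, `information_gain`, `pursue`, and `SAMPLES`, the three toy queries, and the `threshold` parameter are all hypothetical and chosen only for illustration.

```python
import math
from collections import Counter

# Toy data (an assumption, not from the paper): each sample is
# (answers to 3 binary queries, label). Queries: has wings, lays eggs, lives in water.
SAMPLES = [
    ((1, 1, 0), "bird"), ((1, 1, 0), "bird"),
    ((0, 1, 1), "fish"), ((0, 1, 1), "fish"),
    ((0, 0, 0), "mammal"), ((0, 0, 1), "mammal"),
]

def entropy(labels):
    """Shannon entropy (in bits) of the empirical label distribution."""
    n = len(labels)
    if n == 0:
        return 0.0
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(samples, q):
    """Empirical mutual information between the label and the answer to query q."""
    gain = entropy([y for _, y in samples])
    for a in (0, 1):
        subset = [y for ans, y in samples if ans[q] == a]
        gain -= len(subset) / len(samples) * entropy(subset)
    return gain

def pursue(samples, x_answers, threshold=0.99):
    """Greedily query one input until the empirical label posterior is peaked.

    Conditioning on the history is done by keeping only the samples that
    agree with every query-answer pair observed so far (a stand-in for the
    generative-model posterior used in the paper).
    """
    remaining = set(range(len(x_answers)))
    consistent = list(samples)
    history = []
    while True:
        labels = [y for _, y in consistent]
        top_label, top_count = Counter(labels).most_common(1)[0]
        if top_count / len(labels) >= threshold or not remaining:
            return top_label, history
        # Most informative unanswered query given the history so far.
        q = max(sorted(remaining), key=lambda j: information_gain(consistent, j))
        a = x_answers[q]
        narrowed = [(ans, y) for ans, y in consistent if ans[q] == a]
        if not narrowed:  # no sample matches this answer; stop with current guess
            return top_label, history
        history.append((q, a))
        remaining.discard(q)
        consistent = narrowed

if __name__ == "__main__":
    label, chain = pursue(SAMPLES, (0, 1, 1))
    print(label, chain)
```

The returned chain of query-answer pairs is the explanation for the prediction: its length varies per input, which mirrors the abstract's point that the query chain is determined online and can be as deep as needed. Exact counting only works for a toy table like this one, which is why a learned generative model is needed at realistic scale.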
