Paper Title

Making Neural Networks Interpretable with Attribution: Application to Implicit Signals Prediction

Paper Authors

Darius Afchar, Romain Hennequin

Paper Abstract

Explaining recommendations enables users to understand whether recommended items are relevant to their needs and has been shown to increase their trust in the system. More generally, while designing explainable machine learning models is key to checking the sanity and robustness of a decision process and to improving its efficiency, it remains a challenge for complex architectures, especially deep neural networks, which are often deemed "black boxes". In this paper, we propose a novel formulation of interpretable deep neural networks for the attribution task. Unlike popular post-hoc methods, our approach is interpretable by design. Using masked weights, hidden features can be deeply attributed, split into several input-restricted sub-networks, and trained as a boosted mixture of experts. Experimental results on synthetic data and real-world recommendation tasks demonstrate that our method enables building models whose predictive performance is close to that of their non-interpretable counterparts, while providing informative attribution interpretations.
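The core mechanism the abstract describes (masked weights that restrict each sub-network to a subset of input features, combined as a mixture of experts whose gate weights act as attributions by design) can be illustrated with a minimal sketch. The code below assumes PyTorch; the class name InputRestrictedMoE, the feature_groups argument, and the simple softmax gate are illustrative choices rather than the paper's exact architecture, and the boosted training procedure mentioned in the abstract is omitted.

```python
import torch
import torch.nn as nn

class InputRestrictedMoE(nn.Module):
    """Illustrative sketch: each expert reads only a masked subset of the
    input features, so the gate's mixture weights serve as per-group
    attribution scores by design (no post-hoc method needed)."""

    def __init__(self, input_dim, hidden_dim, feature_groups):
        super().__init__()
        # One binary mask per expert: 1 where that expert may read the input.
        masks = torch.zeros(len(feature_groups), input_dim)
        for i, group in enumerate(feature_groups):
            masks[i, group] = 1.0
        self.register_buffer("masks", masks)
        self.experts = nn.ModuleList([
            nn.Sequential(
                nn.Linear(input_dim, hidden_dim),
                nn.ReLU(),
                nn.Linear(hidden_dim, 1),
            )
            for _ in feature_groups
        ])
        self.gate = nn.Linear(input_dim, len(feature_groups))

    def forward(self, x):
        # Each expert sees only its masked view of the input features.
        expert_out = torch.cat(
            [expert(x * mask) for expert, mask in zip(self.experts, self.masks)],
            dim=-1,
        )
        attribution = torch.softmax(self.gate(x), dim=-1)  # sums to 1 per sample
        prediction = (attribution * expert_out).sum(dim=-1)
        return prediction, attribution

# Usage: three feature groups over a 6-dimensional input.
model = InputRestrictedMoE(input_dim=6, hidden_dim=16,
                           feature_groups=[[0, 1], [2, 3], [4, 5]])
y_hat, attr = model(torch.randn(8, 6))  # attr[i] gives per-group credit
```

Because each expert is structurally blind to features outside its group, the mixture weights returned alongside the prediction can be read directly as feature-group attributions, which is the "interpretable by design" property the abstract contrasts with post-hoc methods.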
