Paper Title

A Semiparametric Approach to Interpretable Machine Learning

Authors

Numair Sani, Jaron Lee, Razieh Nabi, Ilya Shpitser

Abstract

Black-box models in machine learning have demonstrated excellent predictive performance in complex problems and high-dimensional settings. However, their lack of transparency and interpretability restricts the applicability of such models in critical decision-making processes. To address this shortcoming, we propose a novel approach to trading off interpretability and performance in prediction models using ideas from semiparametric statistics, allowing us to combine the interpretability of parametric regression models with the performance of nonparametric methods. We achieve this by using a two-piece model: the first piece is interpretable and parametric, and a second, uninterpretable residual piece is added to it. The performance of the overall model is optimized using methods from the sufficient dimension reduction literature. Influence-function-based estimators are derived and shown to be doubly robust. This allows approaches such as double machine learning to be used in estimating our model parameters. We illustrate the utility of our approach via simulation studies and a data application based on predicting the length of stay in the intensive care unit among surgery patients.
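The two-piece idea can be illustrated with a minimal sketch. It assumes a simple partially linear model, y = beta * x + g(z) + noise, where beta * x plays the role of the interpretable parametric piece and g(z) the uninterpretable residual piece. This is not the authors' estimator: it is a generic cross-fitted, residual-on-residual estimate in the spirit of double machine learning, with a binned-mean smoother standing in for a flexible learner. All function names, the simulated data, and the choice of g are hypothetical.

```python
import math
import random


def fit_bin_means(z, t, n_bins=20, lo=-1.0, hi=1.0):
    """Fit a regressogram: the mean of t within equal-width bins of z on [lo, hi]."""
    sums = [0.0] * n_bins
    counts = [0] * n_bins

    def bin_of(zi):
        # Clamp so that values on or outside the boundary land in a valid bin.
        return min(n_bins - 1, max(0, int((zi - lo) / (hi - lo) * n_bins)))

    for zi, ti in zip(z, t):
        b = bin_of(zi)
        sums[b] += ti
        counts[b] += 1
    overall = sum(t) / len(t)
    # Empty bins fall back to the overall mean.
    means = [s / c if c > 0 else overall for s, c in zip(sums, counts)]
    return lambda zi: means[bin_of(zi)]


def cross_fit_beta(x, z, y, n_folds=2):
    """Estimate beta in y = beta * x + g(z) + noise using cross-fitted residuals."""
    n = len(y)
    rx = [0.0] * n  # residual of x after removing its estimated mean given z
    ry = [0.0] * n  # residual of y after removing its estimated mean given z
    for k in range(n_folds):
        train = [i for i in range(n) if i % n_folds != k]
        test = [i for i in range(n) if i % n_folds == k]
        z_train = [z[i] for i in train]
        # Nuisance regressions are fit on the training folds only.
        m_x = fit_bin_means(z_train, [x[i] for i in train])
        m_y = fit_bin_means(z_train, [y[i] for i in train])
        for i in test:
            rx[i] = x[i] - m_x(z[i])
            ry[i] = y[i] - m_y(z[i])
    # Least-squares slope of the y residuals on the x residuals.
    return sum(a * b for a, b in zip(rx, ry)) / sum(a * a for a in rx)


def simulate(n, beta, seed):
    """Simulate data in which z drives both x and the nonlinear piece g(z)."""
    rng = random.Random(seed)
    z = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    x = [math.sin(3.0 * zi) + rng.gauss(0.0, 1.0) for zi in z]
    y = [beta * xi + 1.5 * math.sin(3.0 * zi) + zi ** 2 + rng.gauss(0.0, 0.5)
         for xi, zi in zip(x, z)]
    return x, z, y
```

Calling `cross_fit_beta(*simulate(4000, beta=2.0, seed=0))` should return an estimate close to the true coefficient of 2.0. The interpretable coefficient is recovered without ever specifying the form of g, which is the kind of separation between a parametric piece and a flexible remainder that the abstract describes.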
