Paper title
Uncertainty in Gradient Boosting via Ensembles
Paper authors
Paper abstract
For many practical, high-risk applications, it is essential to quantify uncertainty in a model's predictions to avoid costly mistakes. While predictive uncertainty is widely studied for neural networks, the topic seems to be under-explored for models based on gradient boosting. However, gradient boosting often achieves state-of-the-art results on tabular data. This work examines a probabilistic ensemble-based framework for deriving uncertainty estimates in the predictions of gradient boosting classification and regression models. We conducted experiments on a range of synthetic and real datasets and investigated the applicability of ensemble approaches to gradient boosting models that are themselves ensembles of decision trees. Our analysis shows that ensembles of gradient boosting models successfully detect anomalous inputs while having limited ability to improve the predicted total uncertainty. Importantly, we also propose a concept of a virtual ensemble to get the benefits of an ensemble via only one gradient boosting model, which significantly reduces complexity.
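The abstract does not say how the ensemble's uncertainty estimates are computed. As a rough illustration only, the sketch below uses one common choice for classification ensembles: total uncertainty is the entropy of the averaged prediction, data uncertainty is the average entropy of the individual members, and their difference (a mutual information term) reflects disagreement between members, which is the kind of signal typically used to flag anomalous inputs. The function names `entropy` and `decompose_uncertainty` and the `(members, samples, classes)` array layout are assumptions made for this example, not something taken from the paper.

```python
import numpy as np


def entropy(p, axis=-1, eps=1e-12):
    """Shannon entropy (in nats) along `axis`; probabilities are clipped to avoid log(0)."""
    p = np.clip(p, eps, 1.0)
    return -np.sum(p * np.log(p), axis=axis)


def decompose_uncertainty(member_probs):
    """Split ensemble uncertainty into total, data, and knowledge parts.

    member_probs: array of shape (n_members, n_samples, n_classes) holding the
    class probabilities predicted by each ensemble member (assumed layout).
    Returns three arrays of shape (n_samples,).
    """
    member_probs = np.asarray(member_probs, dtype=float)
    mean_probs = member_probs.mean(axis=0)      # averaged ensemble prediction
    total = entropy(mean_probs)                 # entropy of the averaged prediction
    data = entropy(member_probs).mean(axis=0)   # average entropy of the members
    knowledge = total - data                    # disagreement between members
    return total, data, knowledge


# Two members, two samples: the members agree on the first sample
# and contradict each other on the second.
probs = np.array([
    [[0.5, 0.5], [0.9, 0.1]],
    [[0.5, 0.5], [0.1, 0.9]],
])
total, data, knowledge = decompose_uncertainty(probs)
print(total, data, knowledge)
```

In this toy input both samples have the same total uncertainty, but only the second has a nonzero knowledge term, because the members disagree there. The member predictions could come from independently trained gradient boosting models or, in the spirit of the virtual ensemble mentioned in the abstract, from a single model; how those members are actually obtained is left open here.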