Paper Title
MBCT: Tree-Based Feature-Aware Binning for Individual Uncertainty Calibration
Authors
Abstract
Most machine learning classifiers are concerned only with classification accuracy, whereas certain applications (such as medical diagnosis, meteorological forecasting, and computational advertising) require the model to predict the true probability, known as a calibrated estimate. In previous work, researchers have developed several calibration methods, such as binning and scaling, that post-process the outputs of a predictor to obtain calibrated values. Compared with scaling, binning methods have been shown to have distribution-free theoretical guarantees, which motivates us to prefer binning for calibration. However, we notice that existing binning methods have several drawbacks: (a) the binning scheme considers only the original prediction values, which limits calibration performance; and (b) the binning approach is non-individual, mapping all samples in a bin to the same value, and is therefore unsuitable for order-sensitive applications. In this paper, we propose a feature-aware binning framework, called Multiple Boosting Calibration Trees (MBCT), along with a multi-view calibration loss, to tackle these issues. MBCT optimizes the binning scheme through tree structures built on the features, and adopts a linear function in each tree node to achieve individual calibration. MBCT is non-monotonic and, owing to its learnable binning scheme and individual calibration, has the potential to improve order accuracy. We conduct comprehensive experiments on three datasets from different fields. The results show that our method outperforms all competing models in terms of both calibration error and order accuracy. We also conduct simulation experiments, which indicate that the proposed multi-view calibration loss is a better metric for modeling calibration error.
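To make drawback (b) concrete, the following is a minimal sketch of conventional equal-width histogram binning. It is not the MBCT method described in the abstract. The function names `fit_histogram_binning` and `calibrate`, the toy scores and labels, and the midpoint fallback for empty bins are all assumptions made for illustration. It shows how every sample falling in a bin receives the same calibrated value, so the ordering within the bin is lost.

```python
from bisect import bisect_right


def fit_histogram_binning(scores, labels, n_bins=2):
    """Fit equal-width histogram binning for scores in [0, 1].

    Returns the interior bin edges and one calibrated value per bin
    (the mean label of the training samples that fall in that bin).
    """
    edges = [i / n_bins for i in range(1, n_bins)]  # interior edges only
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for s, y in zip(scores, labels):
        b = bisect_right(edges, s)  # index of the bin containing s
        sums[b] += y
        counts[b] += 1
    # Empty bins fall back to the bin midpoint (an arbitrary choice here).
    values = [
        sums[b] / counts[b] if counts[b] else (b + 0.5) / n_bins
        for b in range(n_bins)
    ]
    return edges, values


def calibrate(score, edges, values):
    """Map a raw score to the calibrated value of its bin."""
    return values[bisect_right(edges, score)]


# Toy data (hypothetical): raw predictor scores and binary labels.
scores = [0.05, 0.10, 0.20, 0.30, 0.40, 0.60, 0.70, 0.80, 0.90, 0.95]
labels = [0, 0, 1, 0, 1, 0, 1, 1, 1, 1]

edges, values = fit_histogram_binning(scores, labels, n_bins=2)

# Two samples with different raw scores in the same bin get the same output.
low = calibrate(0.05, edges, values)
high = calibrate(0.40, edges, values)
print(edges, values, low == high)
```

In this sketch the scores 0.05 and 0.40 are mapped to the same value, so any ranking between them is discarded. According to the abstract, MBCT addresses this by fitting a linear function in each tree node rather than a constant.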