Paper title
A Unifying Theory of Distance from Calibration
Paper authors
Paper abstract
We study the fundamental question of how to define and measure the distance from calibration for probabilistic predictors. While the notion of perfect calibration is well understood, there is no consensus on how to quantify the distance from perfect calibration. Numerous calibration measures have been proposed in the literature, but it is unclear how they compare to each other, and many popular measures such as Expected Calibration Error (ECE) fail to satisfy basic properties like continuity. We present a rigorous framework for analyzing calibration measures, inspired by the literature on property testing. We propose a ground-truth notion of distance from calibration: the $\ell_1$ distance to the nearest perfectly calibrated predictor. We define a consistent calibration measure as one that is polynomially related to this distance. Applying our framework, we identify three calibration measures that are consistent and can be estimated efficiently: smooth calibration, interval calibration, and Laplace kernel calibration. The first two give quadratic approximations to the ground-truth distance, which we show is information-theoretically optimal in a natural model for measuring calibration that we term the prediction-only access model. Our work thus establishes fundamental lower and upper bounds on measuring the distance to calibration, and also provides theoretical justification for preferring certain metrics (like Laplace kernel calibration) in practice.
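The continuity failure of ECE mentioned in the abstract can be illustrated with a two-point toy example. The sketch below is not taken from the paper: the function names `ece` and `laplace_kernel_ce` are hypothetical, the unbinned ECE formula $\mathbb{E}\,|\mathbb{E}[y \mid f] - f|$ and the plug-in kernel estimator with $K(u, v) = e^{-|u - v|}$ are assumed standard forms that the abstract itself does not spell out, and the $\ell_1$ gap computed here is only an upper bound on the distance to the nearest perfectly calibrated predictor.

```python
import math
from collections import defaultdict


def ece(preds, labels):
    """Unbinned ECE on a finite sample: weighted average of |mean(y | f = v) - v|.

    Assumed form; groups points by their exact predicted value.
    """
    groups = defaultdict(list)
    for v, y in zip(preds, labels):
        groups[v].append(y)
    n = len(preds)
    return sum(len(ys) / n * abs(sum(ys) / len(ys) - v) for v, ys in groups.items())


def laplace_kernel_ce(preds, labels):
    """Plug-in kernel calibration error with the Laplace kernel exp(-|u - v|).

    Assumed estimator: sqrt of the mean over all pairs (i, j) of
    (y_i - v_i) * (y_j - v_j) * K(v_i, v_j).
    """
    n = len(preds)
    total = 0.0
    for vi, yi in zip(preds, labels):
        for vj, yj in zip(preds, labels):
            total += (yi - vi) * (yj - vj) * math.exp(-abs(vi - vj))
    # Clamp tiny negative values caused by floating-point rounding.
    return math.sqrt(max(total, 0.0)) / n


# Two equally likely points, one with label 0 and one with label 1.
labels = [0, 1]

# Predicting 1/2 everywhere is perfectly calibrated on this distribution.
calibrated = [0.5, 0.5]

# Shift each prediction by eps toward its label.
eps = 0.01
perturbed = [0.5 - eps, 0.5 + eps]

# Average absolute difference between the two predictors (equals eps up to rounding).
l1_gap = sum(abs(a - b) for a, b in zip(calibrated, perturbed)) / len(labels)

ece_calibrated = ece(calibrated, labels)
ece_perturbed = ece(perturbed, labels)
kce_calibrated = laplace_kernel_ce(calibrated, labels)
kce_perturbed = laplace_kernel_ce(perturbed, labels)
```

Under these assumptions the perturbed predictor sits within `eps = 0.01` of a perfectly calibrated one, yet its ECE jumps from 0 to about 0.49, while the Laplace kernel measure stays below 0.05 and shrinks toward 0 as `eps` does.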