Paper title
Plots of the cumulative differences between observed and expected values of ordered Bernoulli variates
Paper authors
Paper abstract
Many predictions are probabilistic in nature; for example, a prediction could be for precipitation tomorrow, but with only a 30 percent chance. Given both the predictions and the actual outcomes, "reliability diagrams" (also known as "calibration plots") help detect and diagnose statistically significant discrepancies between the predictions and the outcomes. The canonical reliability diagrams are based on histogramming the observed and expected values of the predictions; several variants of the standard reliability diagrams propose to replace the hard histogram binning with soft kernel density estimation using smooth convolutional kernels of widths similar to the widths of the bins. In all cases, an important question naturally arises: which widths are best (or are multiple plots with different widths better)? Rather than answering this question, plots of the cumulative differences between the observed and expected values largely avoid the question, by displaying miscalibration directly as the slopes of secant lines for the graphs. Slope is easy to perceive with quantitative precision even when the constant offsets of the secant lines are irrelevant. There is no need to bin or perform kernel density estimation with a somewhat arbitrary kernel.
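The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and the abstract does not spell out the exact construction. The sketch assumes that the Bernoulli outcomes are ordered by predicted probability, that the observed-minus-expected differences are accumulated and divided by the sample size n, and that the result is graphed against k/n. Under those assumptions, the slope of a secant line between two points of the graph equals the average miscalibration over the corresponding range of predictions. The function names `cumulative_differences` and `secant_slope`, and the synthetic data, are hypothetical.

```python
import random
from itertools import accumulate


def cumulative_differences(probs, outcomes):
    """Cumulative (observed - expected) differences, ordered by predicted probability.

    Assumed normalization: each difference is divided by n and graphed against k/n.
    """
    n = len(probs)
    order = sorted(range(n), key=lambda i: probs[i])
    diffs = [(outcomes[i] - probs[i]) / n for i in order]
    cumulative = [0.0] + list(accumulate(diffs))
    abscissae = [k / n for k in range(n + 1)]
    return abscissae, cumulative


def secant_slope(abscissae, cumulative, j, k):
    """Slope of the secant line between points j and k of the graph."""
    return (cumulative[k] - cumulative[j]) / (abscissae[k] - abscissae[j])


# Synthetic data (an assumption for illustration only): predictions below 0.5
# are calibrated, while predictions of 0.5 or more overstate the true chance by 0.2.
rng = random.Random(0)
n = 10000
probs = [rng.random() for _ in range(n)]
true_probs = [p if p < 0.5 else p - 0.2 for p in probs]
outcomes = [1 if rng.random() < t else 0 for t in true_probs]

xs, cs = cumulative_differences(probs, outcomes)
lower = secant_slope(xs, cs, 0, n // 2)  # roughly the predictions below 0.5
upper = secant_slope(xs, cs, n // 2, n)  # roughly the predictions of 0.5 or more
print("secant slope, lower half:", lower)
print("secant slope, upper half:", upper)
```

In this sketch the lower-half slope should come out near zero and the upper-half slope near -0.2, up to sampling noise. No bin width or kernel width is chosen at any point; the only choice is which two points of the graph to join with a secant line.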