Paper Title
Beyond Strictly Proper Scoring Rules: The Importance of Being Local
Authors
Abstract
The evaluation of probabilistic forecasts plays a central role in the interpretation and use of forecast systems and in their development. Probabilistic scores (scoring rules) provide statistical measures for assessing the quality of probabilistic forecasts. Often, many probabilistic forecast systems are available while evaluations of their performance are not standardized, with different scoring rules being used to measure different aspects of forecast performance. Even when the discussion is restricted to strictly proper scoring rules, there remains considerable variability between them; indeed, strictly proper scoring rules need not rank competing forecast systems in the same order when none of these systems is perfect. The locality property is explored to further distinguish scoring rules. The nonlocal strictly proper scoring rules considered are shown to have a property that can produce "unfortunate" evaluations. In particular, the Continuous Ranked Probability Score prefers outcomes close to the median of the forecast distribution regardless of the probability mass assigned to values at or near the median, which raises concerns about its use. The only local strictly proper scoring rule, the logarithmic score, has direct interpretations in terms of probabilities and bits of information. The nonlocal strictly proper scoring rules, on the other hand, lack a meaningful direct interpretation for decision support. The logarithmic score is also shown to be invariant under smooth transformations of the forecast variable, whereas the nonlocal strictly proper scoring rules considered may change their preferences under such transformations. It is therefore suggested that the logarithmic score always be included in the evaluation of probabilistic forecasts.
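The contrast between a nonlocal score and the logarithmic score can be illustrated with a minimal sketch. It is not taken from the paper: the bimodal forecast, its parameters, the integration grid, and the function names `pdf`, `cdf`, `crps`, and `ignorance` are all assumptions chosen for illustration. The forecast puts almost no probability mass at its median, and the sketch compares how each score rates an outcome at the median versus an outcome at a mode.

```python
import math

# Hypothetical bimodal forecast (an assumption for illustration):
# equal-weight mixture of N(-2, 0.5^2) and N(2, 0.5^2). Its median is 0,
# where the forecast density is close to zero.
COMPONENTS = [(0.5, -2.0, 0.5), (0.5, 2.0, 0.5)]  # (weight, mean, std)


def pdf(x):
    """Forecast density at x."""
    return sum(
        w * math.exp(-0.5 * ((x - m) / s) ** 2) / (s * math.sqrt(2 * math.pi))
        for w, m, s in COMPONENTS
    )


def cdf(x):
    """Forecast cumulative distribution function at x."""
    return sum(
        w * 0.5 * (1 + math.erf((x - m) / (s * math.sqrt(2))))
        for w, m, s in COMPONENTS
    )


def crps(y, lo=-10.0, hi=10.0, n=20000):
    """CRPS as the integral of (F(x) - 1{x >= y})^2, midpoint rule on [lo, hi]."""
    dx = (hi - lo) / n
    total = 0.0
    for i in range(n):
        x = lo + (i + 0.5) * dx
        step = 1.0 if x >= y else 0.0
        total += (cdf(x) - step) ** 2 * dx
    return total


def ignorance(y):
    """Logarithmic score in bits: -log2 of the forecast density at the outcome."""
    return -math.log2(pdf(y))


# Lower is better for both scores.
for y in (0.0, 2.0):
    print(f"y={y:+.1f}  CRPS={crps(y):.3f}  ignorance={ignorance(y):.2f} bits")
```

Under these assumptions, CRPS rates the outcome at the median (y = 0) better than the outcome at the mode (y = 2), even though the forecast assigns almost no probability near 0. The logarithmic score depends only on the density at the outcome, so it ranks the two outcomes the other way round.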