Estimating Uncertainty Intervals from Collaborating Networks

Paper Title

Estimating Uncertainty Intervals from Collaborating Networks

Authors

Tianhui Zhou, Yitong Li, Yuan Wu, David Carlson

Abstract

Effective decision making requires understanding the uncertainty inherent in a prediction. In regression, this uncertainty can be estimated by a variety of methods; however, many of these methods are laborious to tune, generate overconfident uncertainty intervals, or lack sharpness (give imprecise intervals). We address these challenges by proposing a novel method to capture predictive distributions in regression by defining two neural networks with two distinct loss functions. Specifically, one network approximates the cumulative distribution function, and the second network approximates its inverse. We refer to this method as Collaborating Networks (CN). Theoretical analysis demonstrates that a fixed point of the optimization is at the idealized solution, and that the method is asymptotically consistent to the ground truth distribution. Empirically, learning is straightforward and robust. We benchmark CN against several common approaches on two synthetic and six real-world datasets, including forecasting A1c values in diabetic patients from electronic health records, where uncertainty is critical. In the synthetic data, the proposed approach essentially matches ground truth. In the real-world datasets, CN improves results on many performance metrics, including log-likelihood estimates, mean absolute errors, coverage estimates, and prediction interval widths.
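The abstract pairs a cumulative distribution function with its inverse and evaluates intervals by coverage and width. The short Python sketch below illustrates only that relationship, not the authors' training procedure. It assumes a Gaussian from the standard library as a stand-in for a learned predictive distribution at a single input; the helper names `central_interval` and `empirical_coverage`, and the values 7.0 and 1.2, are hypothetical and chosen for illustration.

```python
import random
from statistics import NormalDist


def central_interval(inv_cdf, level=0.9):
    """Central prediction interval at the given level, read off a quantile function."""
    alpha = 1.0 - level
    return inv_cdf(alpha / 2), inv_cdf(1 - alpha / 2)


def empirical_coverage(samples, lower, upper):
    """Fraction of observed outcomes that fall inside [lower, upper]."""
    return sum(lower <= y <= upper for y in samples) / len(samples)


# Assumption: a Gaussian stands in for the predictive distribution at one input.
# In the paper's setting, one network would play the role of `dist.cdf`
# and the other the role of `dist.inv_cdf`.
dist = NormalDist(mu=7.0, sigma=1.2)

# A 90% interval comes directly from the inverse CDF.
lower, upper = central_interval(dist.inv_cdf, level=0.9)
width = upper - lower

# The CDF and its inverse should undo each other: F(F^{-1}(q)) = q.
roundtrip = dist.cdf(dist.inv_cdf(0.25))

# Coverage check on simulated outcomes drawn from the same distribution.
rng = random.Random(0)
samples = [rng.gauss(7.0, 1.2) for _ in range(20000)]
coverage = empirical_coverage(samples, lower, upper)

print(f"interval=({lower:.3f}, {upper:.3f}) width={width:.3f} coverage={coverage:.3f}")
```

A well-calibrated interval has empirical coverage close to its nominal level, and among calibrated intervals a narrower width indicates a sharper prediction. These are the two interval properties the abstract names.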
