Learning as We Go: An Examination of the Statistical Accuracy of COVID19 Daily Death Count Predictions

Paper Title

Learning as We Go: An Examination of the Statistical Accuracy of COVID19 Daily Death Count Predictions

Authors

Roman Marchant, Noelle I. Samia, Ori Rosen, Martin A. Tanner, Sally Cripps

Abstract

This paper provides a formal evaluation of the predictive performance of a model (and its various updates) developed by the Institute for Health Metrics and Evaluation (IHME) for predicting daily deaths attributed to COVID19 for each state in the United States. The IHME models have received extensive attention in social and mass media, and have influenced policy makers at the highest levels of the United States government. For effective policy making the accurate assessment of uncertainty, as well as accurate point predictions, are necessary because the risks inherent in a decision must be taken into account, especially in the present setting of a novel disease affecting millions of lives. To assess the accuracy of the IHME models, we examine both forecast accuracy as well as the predictive performance of the 95% prediction intervals provided by the IHME models. We find that the initial IHME model underestimates the uncertainty surrounding the number of daily deaths substantially. Specifically, the true number of next day deaths fell outside the IHME prediction intervals as much as 70% of the time, in comparison to the expected value of 5%. In addition, we note that the performance of the initial model does not improve with shorter forecast horizons. Regarding the updated models, our analyses indicate that the later models do not show any improvement in the accuracy of the point estimate predictions. In fact, there is some evidence that this accuracy has actually decreased over the initial models. Moreover, when considering the updated models, while we observe a larger percentage of states having actual values lying inside the 95% prediction intervals (PI), our analysis suggests that this observation may be attributed to the widening of the PIs. The width of these intervals calls into question the usefulness of the predictions to drive policy making and resource allocation.
