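Confidence intervals of prediction accuracy measures for multivariable prediction models based on the bootstrap-based optimism correction methods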

Paper title

Confidence intervals of prediction accuracy measures for multivariable prediction models based on the bootstrap-based optimism correction methods

Authors

Hisashi Noma, Tomohiro Shinozaki, Katsuhiro Iba, Satoshi Teramukai, Toshi A. Furukawa

Abstract

In assessing the prediction accuracy of multivariable prediction models, optimism corrections are essential for preventing biased results. However, in most published papers on clinical prediction models, the point estimates of the prediction accuracy measures are corrected by adequate bootstrap-based correction methods, but their confidence intervals are not corrected; e.g., DeLong's confidence interval is usually used for assessing the C-statistic. These naive methods do not adjust for the optimism bias and do not account for statistical variability in the estimation of parameters in the prediction models. Therefore, their coverage probabilities for the true value of the prediction accuracy measure can be seriously below the nominal level (e.g., 95%). In this article, we provide two generic bootstrap methods, namely (1) location-shifted bootstrap confidence intervals and (2) two-stage bootstrap confidence intervals, that can be generally applied to the bootstrap-based optimism correction methods, i.e., Harrell's bias correction, 0.632, and 0.632+ methods. In addition, they can be widely applied to various methods for prediction model development involving modern shrinkage methods such as ridge and lasso regression. In numerical evaluations by simulation, the proposed confidence intervals showed favourable coverage performance, whereas the current standard practices based on the optimism-uncorrected methods showed serious undercoverage. To avoid erroneous results, the optimism-uncorrected confidence intervals should not be used in practice, and the adjusted methods are recommended instead. We also developed the R package predboot for implementing these methods (https://github.com/nomahi/predboot). The effectiveness of the proposed methods is illustrated via applications to the GUSTO-I clinical trial.
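The abstract names Harrell's bias correction without spelling it out. As a rough illustration, the sketch below applies the standard Harrell optimism correction to the point estimate of the C-statistic: refit the model on each bootstrap sample, take the difference between its C-statistic on the bootstrap sample and on the original data, and subtract the average difference from the apparent C-statistic. It does not implement the confidence intervals proposed in the paper and is not the predboot API. All function names, the simulated data, and the plain logistic model are assumptions made for illustration.

```python
import numpy as np

def fit_logistic(X, y, n_iter=25, ridge=1e-6):
    """Logistic regression by Newton-Raphson (tiny ridge term only for numerical stability)."""
    Xd = np.column_stack([np.ones(len(X)), X])
    beta = np.zeros(Xd.shape[1])
    for _ in range(n_iter):
        eta = np.clip(Xd @ beta, -30, 30)
        p = 1.0 / (1.0 + np.exp(-eta))
        w = p * (1.0 - p)
        grad = Xd.T @ (y - p) - ridge * beta
        hess = (Xd * w[:, None]).T @ Xd + ridge * np.eye(Xd.shape[1])
        beta = beta + np.linalg.solve(hess, grad)
    return beta

def linear_predictor(beta, X):
    return beta[0] + X @ beta[1:]

def c_statistic(score, y):
    """Share of (event, non-event) pairs ranked correctly; ties count 1/2."""
    pos, neg = score[y == 1], score[y == 0]
    diff = pos[:, None] - neg[None, :]
    return (np.sum(diff > 0) + 0.5 * np.sum(diff == 0)) / (len(pos) * len(neg))

def harrell_corrected_c(X, y, n_boot=200, seed=0):
    """Return (apparent C, estimated optimism, optimism-corrected C)."""
    rng = np.random.default_rng(seed)
    n = len(y)
    apparent = c_statistic(linear_predictor(fit_logistic(X, y), X), y)
    optimism = []
    while len(optimism) < n_boot:
        idx = rng.integers(0, n, size=n)      # resample rows with replacement
        Xb, yb = X[idx], y[idx]
        if yb.min() == yb.max():              # skip resamples with one outcome class
            continue
        beta_b = fit_logistic(Xb, yb)         # rebuild the model on the resample
        c_boot = c_statistic(linear_predictor(beta_b, Xb), yb)
        c_orig = c_statistic(linear_predictor(beta_b, X), y)
        optimism.append(c_boot - c_orig)
    opt = float(np.mean(optimism))
    return float(apparent), opt, float(apparent - opt)

# Simulated example data (hypothetical, not GUSTO-I).
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 5))
true_eta = -0.5 + 0.8 * X[:, 0] - 0.5 * X[:, 1]
y = (rng.random(200) < 1.0 / (1.0 + np.exp(-true_eta))).astype(float)

apparent, optimism, corrected = harrell_corrected_c(X, y)
print(f"apparent={apparent:.3f} optimism={optimism:.3f} corrected={corrected:.3f}")
```

Only the point estimate is corrected here. The abstract's argument is that the interval reported alongside it also needs adjusting, which is what the location-shifted and two-stage bootstrap intervals are for.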
