Paper title
Docs are ROCs: A simple off-the-shelf approach for estimating average human performance in diagnostic studies
Paper authors
Paper abstract
Average human performance has been estimated inconsistently in diagnostic medicine research. This has been particularly apparent in the field of medical artificial intelligence, where humans are often compared against AI models in multi-reader multi-case studies. Commonly reported metrics, such as the pooled or average human sensitivity and specificity, systematically underestimate the performance of human experts. We present summary receiver operating characteristic (SROC) curve analysis, a technique commonly used in the meta-analysis of diagnostic test accuracy studies, as a sensible and methodologically robust alternative. We describe the motivation for using these methods and present results from applying these meta-analytic techniques to a handful of prominent medical AI studies.
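The claim that averaged sensitivity and specificity underestimate expert performance can be illustrated with a small numerical sketch. It is not the summary ROC method the abstract refers to, only a demonstration of the underlying geometry: if several readers lie on the same concave ROC curve but use different thresholds, the average of their operating points falls below that curve. The equal-variance binormal curve, the `d_prime` value, and the reader false positive rates below are all assumptions chosen for illustration.

```python
from statistics import NormalDist, mean

_STD_NORMAL = NormalDist()  # standard normal, used for the binormal ROC model


def roc_tpr(fpr, d_prime=1.5):
    """Sensitivity at a given false positive rate on an assumed
    equal-variance binormal ROC curve: TPR = Phi(d' + Phi^-1(FPR))."""
    return _STD_NORMAL.cdf(d_prime + _STD_NORMAL.inv_cdf(fpr))


def average_operating_point(fprs, d_prime=1.5):
    """Naively average the readers' (FPR, TPR) operating points."""
    tprs = [roc_tpr(f, d_prime) for f in fprs]
    return mean(fprs), mean(tprs)


# Hypothetical readers: identical skill (same curve), different thresholds.
reader_fprs = [0.05, 0.20, 0.50]

avg_fpr, avg_tpr = average_operating_point(reader_fprs)
curve_tpr = roc_tpr(avg_fpr)  # what the shared curve achieves at the same FPR

print(f"average reader point: FPR={avg_fpr:.3f}, TPR={avg_tpr:.3f}")
print(f"shared ROC curve at that FPR: TPR={curve_tpr:.3f}")
```

Because the assumed curve is concave, the averaged sensitivity is lower than the sensitivity the shared curve attains at the averaged false positive rate, even though every reader sits exactly on that curve. A summary curve fitted through the reader points avoids this bias.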