Title
A study linking patient EHR data to external death data at Stanford Medicine
Authors
Abstract
This manuscript explores linking real-world patient data with external death data in the context of research Clinical Data Warehouses (r-CDWs). Specifically, we present the linkage of Electronic Health Record (EHR) data for Stanford Health Care (SHC) patients with data from the Social Security Administration (SSA) Limited Access Death Master File (LADMF), which is made available by the US Department of Commerce's National Technical Information Service (NTIS). The data analysis framework presented here extends prior approaches and generalizes to linking any two cross-organizational real-world patient data sources. EHR data and the NTIS LADMF are heavily used at other medical centers, and we expect the methods and lessons presented here to be valuable to others. Our findings suggest that strong linkages are incomplete and weak linkages are noisy; that is, no single linkage rule provides both coverage and accuracy. Furthermore, the best linkage rule for one pair of datasets differs from the best linkage rule for another pair; that is, linkage rules do not generalize. Finally, the LADMF, a commonly used external death data resource for r-CDWs, has a significant gap in its death data, making it necessary for r-CDWs to seek out more than one external death data source. We anticipate that having multiple linkages will make it hard to present the linkage outcome to the end user. This manuscript is a resource in support of the Stanford Medicine STARR (STAnford medicine Research data Repository) r-CDWs. The data are stored and analyzed as PHI in our HIPAA-compliant data center and are used for research and development (R&D) activities under the STARR IRB.