Paper title

Harnessing electronic health records for real-world evidence

Authors

Hou, Jue; Zhao, Rachel; Gronsbell, Jessica; Beaulieu-Jones, Brett K.; Webber, Griffin; Jemielita, Thomas; Wan, Shuyan; Hong, Chuan; Lin, Yucong; Cai, Tianrun; Wen, Jun; Panickan, Vidul A.; Bonzel, Clara-Lea; Liaw, Kai-Li; Liao, Katherine P.; Cai, Tianxi

Abstract

While randomized controlled trials (RCTs) are the gold standard for establishing the efficacy and safety of a medical treatment, real-world evidence (RWE) generated from real-world data (RWD) has been vital in post-approval monitoring and is being promoted for the regulatory process of experimental therapies. An emerging source of RWD is electronic health records (EHRs), which contain detailed information on patient care in both structured (e.g., diagnosis codes) and unstructured (e.g., clinical notes, images) form. Despite the granularity of the data available in EHRs, critical variables required to reliably assess the relationship between a treatment and clinical outcome can be challenging to extract. We provide an integrated data curation and modeling pipeline leveraging recent advances in natural language processing, computational phenotyping, and modeling techniques with noisy data to address this fundamental challenge and accelerate the reliable use of EHRs for RWE, as well as the creation of digital twins. The proposed pipeline is highly automated for the task and includes guidance for deployment. Examples are also drawn from existing literature on EHR emulation of RCTs and accompanied by our own studies with Mass General Brigham (MGB) EHR.
