Paper Title
COMPOSE: Cross-Modal Pseudo-Siamese Network for Patient Trial Matching
Paper Authors
Paper Abstract
Clinical trials play an important role in drug development but often suffer from expensive, inaccurate, and insufficient patient recruitment. The availability of massive electronic health record (EHR) data and trial eligibility criteria (EC) brings new opportunities for data-driven patient recruitment. One key task, patient-trial matching, is to find qualified patients for clinical trials given structured EHR data and unstructured EC text (both inclusion and exclusion criteria). How can complex EC text be matched with longitudinal patient EHRs? How can the many-to-many relationships between patients and trials be embedded? How can the difference between inclusion and exclusion criteria be handled explicitly? In this paper, we propose the CrOss-Modal PseudO-SiamEse network (COMPOSE) to address these challenges for patient-trial matching. One path of the network encodes EC using a convolutional highway network. The other path processes EHR data with a multi-granularity memory network that encodes structured patient records into multiple levels based on a medical ontology. Using the EC embedding as the query, COMPOSE performs attentional record alignment and thus enables dynamic patient-trial matching. COMPOSE also introduces a composite loss term that maximizes the similarity between patient records and inclusion criteria while minimizing the similarity to exclusion criteria. Experimental results show that COMPOSE reaches 98.0% AUC on patient-criteria matching and 83.7% accuracy on patient-trial matching, a 24.3% improvement over the best baseline on real-world patient-trial matching tasks.
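The two mechanisms named in the abstract, using the EC embedding as an attention query over patient memory and pulling inclusion criteria closer while pushing exclusion criteria away, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the function names `attend` and `inclusion_exclusion_loss`, the dot-product scoring, the cosine similarity, the hinge form of the exclusion term, and the `margin` parameter are all hypothetical stand-ins for whatever the authors actually use.

```python
import math


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    norm = math.sqrt(dot(u, u)) * math.sqrt(dot(v, v))
    return dot(u, v) / norm if norm else 0.0


def attend(query, memory):
    """Hypothetical attentional alignment: softmax over query-slot dot products,
    then a weighted sum of the memory slots (one slot per patient record level)."""
    scores = [dot(query, slot) for slot in memory]
    peak = max(scores)  # subtract the max for numerical stability
    weights = [math.exp(s - peak) for s in scores]
    total = sum(weights)
    weights = [w / total for w in weights]
    dim = len(memory[0])
    return [sum(w * slot[i] for w, slot in zip(weights, memory)) for i in range(dim)]


def inclusion_exclusion_loss(patient, criterion, is_inclusion, margin=0.0):
    """Assumed loss shape: reward similarity for inclusion criteria,
    penalize similarity above `margin` for exclusion criteria."""
    sim = cosine(patient, criterion)
    if is_inclusion:
        return 1.0 - sim
    return max(0.0, sim - margin)


# Toy usage: two memory slots and a criterion embedding aligned with the first.
memory = [[1.0, 0.0], [0.0, 1.0]]
criterion = [5.0, 0.0]
patient = attend(criterion, memory)
print(inclusion_exclusion_loss(patient, criterion, is_inclusion=True))
print(inclusion_exclusion_loss(patient, criterion, is_inclusion=False))
```

In this toy case the query concentrates attention on the first slot, so the same patient representation yields a small loss when the criterion is treated as an inclusion criterion and a large one when it is treated as an exclusion criterion. In a full model this term would presumably be combined with a classification loss, which is what "composite" suggests, but the abstract does not spell out the combination.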