Extracting Biomedical Factual Knowledge Using Pretrained Language Model and Electronic Health Record Context

Paper Title

Extracting Biomedical Factual Knowledge Using Pretrained Language Model and Electronic Health Record Context

Authors

Zonghai Yao, Yi Cao, Zhichao Yang, Vijeta Deshpande, Hong Yu

Abstract

Language Models (LMs) have performed well on biomedical natural language processing applications. In this study, we conduct experiments that use prompt methods to extract knowledge from LMs, treating them as new knowledge bases (LMs as KBs). However, prompting can only serve as a lower bound for knowledge extraction, and it performs particularly poorly on biomedical domain KBs. To bring LMs as KBs more in line with actual application scenarios in the biomedical domain, we add electronic health record (EHR) notes as context to the prompt to improve this lower bound. We design and validate a series of experiments for our Dynamic-Context-BioLAMA task. Our experiments show that the knowledge possessed by these language models can distinguish the correct knowledge from the noise knowledge in the EHR notes, and that this distinguishing ability can also be used as a new metric to evaluate the amount of knowledge possessed by the model.
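
The idea of adding an EHR note as context to a prompt can be illustrated with a minimal sketch. It only shows how a cloze-style prompt might be built with and without a note prepended; it is not the authors' implementation. The `[X]`/`[Y]` placeholder convention, the `[MASK]` token, the relation template, the function name `build_prompt`, and the example note are all assumptions made for illustration.

```python
from typing import Optional

# Assumed mask token; the real token depends on the model's tokenizer.
MASK = "[MASK]"


def build_prompt(subject: str, template: str, context: Optional[str] = None) -> str:
    """Fill a cloze template and optionally prepend an EHR note as context.

    `template` uses [X] for the subject and [Y] for the object to predict
    (a hypothetical convention chosen for this sketch).
    """
    cloze = template.replace("[X]", subject).replace("[Y]", MASK)
    if context:
        # Context-augmented prompt: the note comes first, then the cloze query.
        return f"{context.strip()} {cloze}"
    # Plain prompt: the model must answer from its own stored knowledge.
    return cloze


# Hypothetical relation template and invented example note (not real patient data).
template = "[X] is commonly treated with [Y]."
note = "Patient with type 2 diabetes, started on metformin 500 mg daily."

plain_prompt = build_prompt("Type 2 diabetes", template)
context_prompt = build_prompt("Type 2 diabetes", template, context=note)

print(plain_prompt)
print(context_prompt)
```

Either string could then be passed to a masked language model, and the predictions for the masked position compared between the two settings to see how much the note helps or misleads the model.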
