论文标题

深度学习在生成结构化放射学报告中的应用:基于变压器的技术

Application of Deep Learning in Generating Structured Radiology Reports: A Transformer-Based Technique

论文作者

Moezzi, Seyed Ali Reza, Ghaedi, Abdolrahman, Rahmanian, Mojdeh, Mousavi, Seyedeh Zahra, Sami, Ashkan

论文摘要

由于临床实践所需的放射学报告和研究是在自由文本叙述中编写和存储的,因此很难提取相对信息以进行进一步分析。在这种情况下,自然语言处理(NLP)技术可以促进自动信息提取和自由文本格式向结构化数据的转换。近年来,基于深度学习(DL)的模型已适用于NLP实验,并具有令人鼓舞的结果。尽管基于人工神经网络(ANN)和卷积神经网络(CNN)的DL模型具有巨大的潜力,但在临床实践中,这些模型仍面临一些局限性。变形金刚是另一种新的DL体系结构,已越来越多地用于改善流程。因此,在这项研究中,我们提出了一种基于变压器的细粒命名实体识别(NER)架构,以进行临床信息提取。我们以自由文本格式收集了88次腹部超声检查报告,并根据我们开发的信息模式注释它们。将文本到文本传输变压器模型(T5)和covive(一种预训练的域特异性适应T5模型)应用于微调来提取实体和关系,并将输入转换为结构化的格式。我们基于变压器的模型分别优于先前应用的方法,例如基于Rouge-1,Rouge-2,Rouge-L和BLEU分别为0.816、0.668、0.528和0.743的ANN和CNN模型,同时提供了可解释的结构性报告。

Since radiology reports needed for clinical practice and research are written and stored in free-text narrations, extraction of relative information for further analysis is difficult. In these circumstances, natural language processing (NLP) techniques can facilitate automatic information extraction and transformation of free-text formats to structured data. In recent years, deep learning (DL)-based models have been adapted for NLP experiments with promising results. Despite the significant potential of DL models based on artificial neural networks (ANN) and convolutional neural networks (CNN), the models face some limitations to implement in clinical practice. Transformers, another new DL architecture, have been increasingly applied to improve the process. Therefore, in this study, we propose a transformer-based fine-grained named entity recognition (NER) architecture for clinical information extraction. We collected 88 abdominopelvic sonography reports in free-text formats and annotated them based on our developed information schema. The text-to-text transfer transformer model (T5) and Scifive, a pre-trained domain-specific adaptation of the T5 model, were applied for fine-tuning to extract entities and relations and transform the input into a structured format. Our transformer-based model in this study outperformed previously applied approaches such as ANN and CNN models based on ROUGE-1, ROUGE-2, ROUGE-L, and BLEU scores of 0.816, 0.668, 0.528, and 0.743, respectively, while providing an interpretable structured report.

扫码加入交流群

加入微信交流群

微信交流群二维码

扫码加入学术交流群,获取更多资源