Paper title
Tracking biomedical articles along the translational continuum: a measure based on biomedical knowledge representation
Paper authors
Abstract
Keeping track of translational research is essential to evaluating the performance of translational medicine programs. Although previous studies have proposed several indicators, a consensus measure is still needed to represent the translational features of biomedical research at the article level. In this study, we first trained semantic representations of biomedical entities and documents (i.e., bio entity2vec and bio doc2vec) on over 30 million PubMed articles. With these vectors, we then developed a new measure called Translational Progression (TP) for tracking biomedical articles along the translational continuum. We validated the effectiveness of TP from two perspectives (clinical trial phase identification and ACH classification), both of which showed excellent consistency between TP and other indicators. TP also has several advantages. First, it can track the degree of translation of biomedical research dynamically and in real time. Second, it is straightforward to interpret and operationalize. Third, it does not require labor-intensive MeSH labeling, and it is suitable for big scholarly data as well as for papers that are not indexed in PubMed. In addition, we examined the translational progression of biomedical research along three dimensions (overall distribution, time, and research topic), which revealed three significant findings. The measure proposed in this study could be used by policymakers to monitor biomedical research with high translational potential in real time and to make better decisions. It could also be adopted and improved for other domains, such as physics or computer science, to assess the application value of scientific discoveries.