MedGraph: A semantic biomedical information retrieval framework using knowledge graph embedding for PubMed

Cited by: 4
Authors
Ebeid, Islam Akef [1]
Affiliation
[1] Univ Arkansas, Dept Informat Sci, Little Rock, AR 72204 USA
Source
FRONTIERS IN BIG DATA | 2022 / Vol. 5
Funding
National Institutes of Health (USA);
Keywords
knowledge graph; natural language processing; information retrieval; biomedical digital libraries; graph embedding;
DOI
10.3389/fdata.2022.965619
Chinese Library Classification
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
Here we study the semantic search and retrieval problem in biomedical digital libraries. First, we introduce MedGraph, a knowledge graph embedding-based method that provides semantic relevance retrieval and ranking for the biomedical literature indexed in PubMed. Second, we evaluate our approach using PubMed's Best Match algorithm. Moreover, we compare our method MedGraph to a traditional TF-IDF-based algorithm. Third, we use a dataset extracted from PubMed that includes metadata for 30 million articles, such as abstracts, author information, citation information, and extracted biological entity mentions. We use a subset of the dataset to evaluate MedGraph using predefined queries with ground truth ranked results. To our knowledge, this technique has not been explored before in biomedical information retrieval. In addition, our results provide some evidence that semantic approaches to search and relevance in biomedical digital libraries that rely on knowledge graph modeling offer better search relevance results than traditional methods in terms of objective metrics.
Pages: 15