Biological applications of knowledge graph embedding models

Cited by: 81
Authors
Mohamed, Sameh K. [1, 2]
Nounu, Aayah
Novacek, Vit [3 ]
Affiliations
[1] Natl Univ Ireland Galway, Insight Ctr, Comp Sci, Galway, Ireland
[2] Data Sci Inst, Galway, Ireland
[3] Natl Univ Ireland Galway, Data Sci Inst, Biomed Discovery Informat Unit, Galway, Ireland
Funding
EU Horizon 2020; Science Foundation Ireland;
Keywords
biomedical knowledge graphs; knowledge graph embeddings; tensor factorization; link prediction; drug-target interactions; polypharmacy side effects; MULTICELLULAR FUNCTION; INTERACTION PREDICTION; DATABASE; NETWORKS; INTEGRATION; DRUGBANK; RESOURCE; DISEASE; DRUGS; GENE;
DOI
10.1093/bib/bbaa012
Chinese Library Classification (CLC) number
Q5 [Biochemistry];
Subject classification code
071010 ; 081704 ;
Abstract
Complex biological systems are traditionally modelled as graphs of interconnected biological entities. These graphs, i.e. biological knowledge graphs, are then processed using graph exploratory approaches to perform different types of analytical and predictive tasks. Despite the high predictive accuracy of these approaches, they have limited scalability due to their dependency on time-consuming path exploratory procedures. In recent years, owing to the rapid advances of computational technologies, new approaches for modelling graphs and mining them with high accuracy and scalability have emerged. These approaches, i.e. knowledge graph embedding (KGE) models, operate by learning low-rank vector representations of graph nodes and edges that preserve the graph's inherent structure. These approaches were used to analyse knowledge graphs from different domains where they showed superior performance and accuracy compared to previous graph exploratory approaches. In this work, we study this class of models in the context of biological knowledge graphs and their different applications. We then show how KGE models can be a natural fit for representing complex biological knowledge modelled as graphs. We also discuss their predictive and analytical capabilities in different biology applications. In this regard, we present two example case studies that demonstrate the capabilities of KGE models: prediction of drug-target interactions and polypharmacy side effects. Finally, we analyse different practical considerations for KGEs, and we discuss possible opportunities and challenges related to adopting them for modelling biological systems.
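As a rough illustration of the link prediction idea described in the abstract, the sketch below scores candidate drug-target triples with a TransE-style translation distance. TransE is used here only as one well-known example of a KGE scoring function; the abstract does not say which models the paper evaluates. The entity names, the relation name and the two-dimensional vectors are hypothetical and hand-picked rather than learned, so this is a minimal sketch of the scoring and ranking step, not a trained model.

```python
import math

# Toy, hand-picked 2-d embeddings (hypothetical values, NOT learned from data).
entity_vecs = {
    "drug_A": [0.0, 0.0],
    "protein_X": [1.0, 0.0],
    "protein_Y": [0.0, 3.0],
}
relation_vecs = {"targets": [1.0, 0.0]}


def transe_score(head, relation, tail):
    """Return the negative Euclidean distance between (head + relation) and tail.

    Scores closer to zero mean the triple is considered more plausible.
    """
    h = entity_vecs[head]
    r = relation_vecs[relation]
    t = entity_vecs[tail]
    return -math.sqrt(sum((hi + ri - ti) ** 2 for hi, ri, ti in zip(h, r, t)))


def rank_tails(head, relation, candidates):
    """Sort candidate tail entities from most to least plausible."""
    return sorted(
        candidates,
        key=lambda c: transe_score(head, relation, c),
        reverse=True,
    )


# Rank hypothetical target proteins for a hypothetical drug.
ranking = rank_tails("drug_A", "targets", ["protein_X", "protein_Y"])
```

In a real setting the vectors would be learned by optimising such a score over the known triples of a biological knowledge graph, and the ranking step would then be applied to unseen drug-target pairs.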
Pages: 1679-1693
Number of pages: 15