HPO2GO: prediction of human phenotype ontology term associations for proteins using cross ontology annotation co-occurrences

Cited by: 23
Author
Dogan, Tunca [1 ,2 ,3 ]
Affiliations
[1] Middle East Tech Univ, Grad Sch Informat, Dept Hlth Informat, Ankara, Turkey
[2] Middle East Tech Univ, Grad Sch Informat, Canc Syst Biol Lab KanSiL, Ankara, Turkey
[3] EBI, EMBL, Cambridge, England
Source
PEERJ | 2018 | Volume 6
Keywords
Human phenotype ontology; Gene ontology; Cross ontology mapping; Ontological term prediction; Statistical resampling; Predictive performance analysis; SEMANTIC SIMILARITY; FANCONI-ANEMIA; DATABASE; CLASSIFICATION; DISEASES; FAMILY; TOOL;
DOI
10.7717/peerj.5298
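Chinese Library Classification
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences];
Discipline classification codes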
07 ; 0710 ; 09 ;
Abstract
Analysing the relationships between biomolecules and genetic diseases is a highly active area of research, where the aim is to identify the genes and gene products that cause a particular disease through functional changes originating from mutations. Biological ontologies are frequently employed in these studies, providing researchers with extensive opportunities for knowledge discovery through computational data analysis. In this study, a novel approach is proposed for identifying relationships between biomedical entities by automatically mapping Human Phenotype Ontology (HPO) terms, which define phenotypic abnormalities, to Gene Ontology (GO) terms, which define biomolecular functions; each association indicates that the abnormality occurs due to the loss of the biomolecular function expressed by the corresponding GO term. The proposed HPO2GO mappings were extracted by calculating the frequency of co-annotations of the terms on the same genes/proteins, using existing curated HPO and GO annotation sets. Unreliable mappings that could be observed by chance were then filtered out by statistical resampling of the co-occurrence similarity distributions. Furthermore, the biological relevance of the finalized mappings was discussed for selected cases, using the literature. The resulting HPO2GO mappings can be employed in different settings to predict and analyse novel gene/protein-ontology term-disease relations. As an application of the proposed approach, HPO term-protein associations (i.e., HPO2protein) were predicted. To test the predictive performance of the method quantitatively and to compare it with the state of the art, the CAFA2 challenge HPO prediction target protein set was employed. The benchmark results indicated the potential of the proposed approach, as HPO2GO performance was among the best (Fmax = 0.35).
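The Fmax value reported above is the protein-centric maximum F-measure commonly used in CAFA-style benchmarks. The following is a minimal sketch of that metric, not the paper's evaluation code: the function name `fmax`, the input layout (a score per predicted term for each protein) and the threshold grid are assumptions made for illustration.

```python
def fmax(predictions, truth, thresholds=None):
    """Protein-centric maximum F-measure over score thresholds.

    predictions: {protein: {term: score}} with scores in [0, 1] (assumed layout)
    truth:       {protein: set of true terms}, each set non-empty
    """
    if thresholds is None:
        # Assumed grid: 0.01, 0.02, ..., 1.00
        thresholds = [i / 100 for i in range(1, 101)]

    best = 0.0
    for t in thresholds:
        precisions = []   # one entry per protein with >= 1 prediction at t
        recall_sum = 0.0  # summed over all benchmark proteins
        for protein, true_terms in truth.items():
            scores = predictions.get(protein, {})
            predicted = {term for term, s in scores.items() if s >= t}
            hits = len(predicted & true_terms)
            if predicted:
                precisions.append(hits / len(predicted))
            recall_sum += hits / len(true_terms)

        if not precisions:
            continue  # no protein has a prediction at this threshold

        precision = sum(precisions) / len(precisions)
        recall = recall_sum / len(truth)
        if precision + recall > 0:
            best = max(best, 2 * precision * recall / (precision + recall))
    return best


# Toy example with hypothetical identifiers
toy_predictions = {"P1": {"HP:A": 0.9, "HP:B": 0.4}, "P2": {"HP:C": 0.8}}
toy_truth = {"P1": {"HP:A"}, "P2": {"HP:D"}}
print(round(fmax(toy_predictions, toy_truth), 3))
```

Precision is averaged only over proteins that receive at least one prediction at a given threshold, while recall is averaged over all benchmark proteins, so a method that predicts for few proteins is penalised on recall rather than precision.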
The automated cross ontology mapping approach developed in this work may be extended to other ontologies as well, to identify unexplored relation patterns at the systemic level. The datasets, results and the source code of HPO2GO are available for download at: https://github.com/cansyl/HPO2GO.
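The mapping procedure summarised in the abstract (counting co-annotations of HPO and GO terms on the same genes, then filtering chance associations by resampling) can be illustrated with a small sketch. This is not the code from the HPO2GO repository: the Dice-style similarity score, the gene-label shuffling used to build a null distribution, the quantile cutoff, and all function and parameter names are assumptions chosen for illustration.

```python
import random
from collections import Counter
from itertools import product


def cooccurrence_counts(hpo_annots, go_annots):
    """For every (HPO, GO) pair, count the genes annotated with both terms."""
    counts = Counter()
    for gene in hpo_annots.keys() & go_annots.keys():
        counts.update(product(hpo_annots[gene], go_annots[gene]))
    return counts


def term_frequencies(annots):
    """Number of genes annotated with each term."""
    freq = Counter()
    for terms in annots.values():
        freq.update(terms)
    return freq


def similarity_scores(hpo_annots, go_annots):
    """Assumed Dice-style score: 2 * co-occurrences / (HPO freq + GO freq)."""
    co = cooccurrence_counts(hpo_annots, go_annots)
    n_hpo = term_frequencies(hpo_annots)
    n_go = term_frequencies(go_annots)
    return {
        (h, g): 2 * n / (n_hpo[h] + n_go[g])
        for (h, g), n in co.items()
    }


def shuffled(annots, rng):
    """Reassign annotation sets to genes at random (term frequencies kept)."""
    genes = list(annots)
    term_sets = [annots[gene] for gene in genes]
    rng.shuffle(term_sets)
    return dict(zip(genes, term_sets))


def filter_mappings(hpo_annots, go_annots, n_resamples=200,
                    quantile=0.95, seed=0):
    """Keep pairs scoring above a quantile of the resampled null scores."""
    rng = random.Random(seed)
    observed = similarity_scores(hpo_annots, go_annots)
    null_scores = []
    for _ in range(n_resamples):
        permuted = shuffled(hpo_annots, rng)
        null_scores.extend(similarity_scores(permuted, go_annots).values())
    if not null_scores:
        return observed
    null_scores.sort()
    cutoff = null_scores[int(quantile * (len(null_scores) - 1))]
    return {pair: s for pair, s in observed.items() if s > cutoff}


# Toy annotation sets with hypothetical gene and term identifiers
hpo = {"G1": {"HP:1"}, "G2": {"HP:1"}, "G3": {"HP:2"}}
go = {"G1": {"GO:1"}, "G2": {"GO:1"}, "G3": {"GO:2"}, "G4": {"GO:1"}}
print(similarity_scores(hpo, go))
print(filter_mappings(hpo, go))
```

Shuffling gene labels on one annotation set preserves how often each term is used while breaking the link between the two ontologies, which is one simple way to estimate how large a score can arise by chance. Real annotation sets would first need to be propagated to ancestor terms and loaded from the curated HPO and GO annotation files, which this sketch leaves out.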
Pages: 33