Predicting Protein Relationships to Human Pathways through a Relational Learning Approach Based on Simple Sequence Features

Cited by: 3
Authors
Garcia-Jimenez, Beatriz [1]
Pons, Tirso [2]
Sanchis, Araceli [1]
Valencia, Alfonso [2]
Affiliations
[1] Univ Carlos III Madrid, Dept Comp Sci, Madrid 28911, Spain
[2] Spanish Natl Canc Res Ctr CNIO, Struct Biol & BioComp Programme, Madrid 28029, Spain
Keywords
Pathway relationship prediction; sequence-based prediction; knowledge relational representation; machine learning; function prediction; human reactome pathways; BIOLOGICAL PATHWAYS; SYSTEMS BIOLOGY; LIGASE ACTIVITY; DATA SETS; BIOINFORMATICS; ONTOLOGY; DATABASE
DOI
10.1109/TCBB.2014.2318730
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Biological pathways are important elements of systems biology, and in the past decade an increasing number of pathway databases have been set up to document the growing understanding of complex cellular processes. Although more genome-sequence data are becoming available, a large fraction of them remain functionally uncharacterized. Thus, it is important to be able to predict the mapping of poorly annotated proteins to original pathway models. Results: We have developed a Relational Learning-based Extension (RLE) system to investigate pathway membership through a function prediction approach that relies mainly on combinations of simple properties attributed to each protein. RLE searches for proteins with molecular similarities to specific pathway components. Using RLE, we associated 383 uncharacterized proteins with 28 pre-defined human Reactome pathways, demonstrating relative confidence after proper evaluation. Indeed, in specific cases, manual inspection of the database annotations and the related literature supported the proposed classifications. Examples of possible additional components of the Electron transport system, Telomere maintenance, and Integrin cell surface interactions pathways are discussed in detail. Availability: All predicted human proteins for Reactome releases 30 (2009) and 40 (2012) are available at http://rle.bioinfo.cnio.es.
Pages: 753-765
Page count: 13