SherLoc2: A High-Accuracy Hybrid Method for Predicting Subcellular Localization of Proteins

Cited by: 97
Authors
Briesemeister, Sebastian [1 ]
Blum, Torsten [1 ]
Brady, Scott [2 ]
Lam, Yin [2 ]
Kohlbacher, Oliver [1 ]
Shatkay, Hagit [2 ]
Affiliations
[1] Univ Tubingen, Div Simulat Biol Sci, Ctr Bioinformat Tubingen, D-72074 Tubingen, Germany
[2] Queens Univ, Sch Comp, Kingston, ON, Canada
Funding
Natural Sciences and Engineering Research Council of Canada;
Keywords
protein subcellular localization prediction; machine learning; text mining; Gene Ontology; support vector machines; Gene Ontology terms; sequence; classification; location; text; cell;
DOI
10.1021/pr900665y
Chinese Library Classification number
Q5 [Biochemistry];
Discipline classification codes
071010 ; 081704 ;
Abstract
SherLoc2 is a comprehensive high-accuracy subcellular localization prediction system. It is applicable to animal, fungal, and plant proteins and covers all main eukaryotic subcellular locations. SherLoc2 integrates several sequence-based features as well as text-based features. In addition, we incorporate phylogenetic profiles and Gene Ontology (GO) terms derived from the protein sequence to considerably improve the prediction performance. SherLoc2 achieves an overall classification accuracy of up to 93% in 5-fold cross-validation. A novel feature, DiaLoc, allows users to manually provide their current background knowledge by describing a protein in a short abstract, which is then used to improve the prediction. SherLoc2 is available both as a free Web service and as a stand-alone version at http://www-bsinformatik.uni-tuebingen.de/Services/SherLoc2.
Pages: 5363-5366
Page count: 4
Related papers
32 in total
[1]   Extensive feature detection of N-terminal protein sorting signals [J].
Bannai, H ;
Tamada, Y ;
Maruyama, O ;
Nakai, K ;
Miyano, S .
BIOINFORMATICS, 2002, 18 (02) :298-305
[2]   Improved prediction of signal peptides: SignalP 3.0 [J].
Bendtsen, JD ;
Nielsen, H ;
von Heijne, G ;
Brunak, S .
JOURNAL OF MOLECULAR BIOLOGY, 2004, 340 (04) :783-795
[3]   MultiLoc2: integrating phylogeny and Gene Ontology terms improves subcellular protein localization prediction [J].
Blum, Torsten ;
Briesemeister, Sebastian ;
Kohlbacher, Oliver .
BMC BIOINFORMATICS, 2009, 10 :274
[4]   Prediction of subcellular localization using sequence-biased recurrent networks [J].
Bodén, M ;
Hawkins, J .
BIOINFORMATICS, 2005, 21 (10) :2279-2286
[5]  
Brady Scott, 2008, Pac Symp Biocomput, P604
[6]  
Casadio Rita, 2008, Briefings in Functional Genomics & Proteomics, V7, P63, DOI 10.1093/bfgp/eln003
[7]   LIBSVM: A Library for Support Vector Machines [J].
Chang, Chih-Chung ;
Lin, Chih-Jen .
ACM TRANSACTIONS ON INTELLIGENT SYSTEMS AND TECHNOLOGY, 2011, 2 (03)
[8]   Prediction and classification of protein subcellular location - Sequence-order effect and pseudo amino acid composition [J].
Chou, KC ;
Cai, YD .
JOURNAL OF CELLULAR BIOCHEMISTRY, 2003, 90 (06) :1250-1260
[9]   A new hybrid approach to predict subcellular localization of proteins by incorporating gene ontology [J].
Chou, KC ;
Cai, YD .
BIOCHEMICAL AND BIOPHYSICAL RESEARCH COMMUNICATIONS, 2003, 311 (03) :743-747
[10]   Using functional domain composition and support vector machines for prediction of protein subcellular location [J].
Chou, KC ;
Cai, YD .
JOURNAL OF BIOLOGICAL CHEMISTRY, 2002, 277 (48) :45765-45769