EuLoc: a web-server for accurately predict protein subcellular localization in eukaryotes by incorporating various features of sequence segments into the general form of Chou's PseAAC

Cited by: 53
Authors
Chang, Tzu-Hao [1 ]
Wu, Li-Ching [2 ]
Lee, Tzong-Yi [3 ]
Chen, Shu-Pin [4 ]
Huang, Hsien-Da [5 ,6 ]
Horng, Jorng-Tzong [2 ,4 ,7 ]
Affiliations
[1] Taipei Med Univ, Grad Inst Biomed Informat, Taipei, Taiwan
[2] Natl Cent Univ, Inst Syst Biol & Bioinformat, Chungli 320, Taiwan
[3] Yuan Ze Univ, Dept Comp Sci & Engn, Chungli 320, Taiwan
[4] Natl Cent Univ, Dept Comp Sci & Informat Engn, Chungli 320, Taiwan
[5] Natl Chiao Tung Univ, Inst Bioinformat & Syst Biol, Hsinchu 300, Taiwan
[6] Natl Chiao Tung Univ, Dept Biol Sci & Technol, Hsinchu 300, Taiwan
[7] Asia Univ, Dept Biomed Informat, Wufeng 413, Taiwan
Keywords
Subcellular localization; Protein function; Eukaryote; Support vector machine; AMINO-ACID-COMPOSITION; SUPPORT VECTOR MACHINE; MAXIMAL DEPENDENCE DECOMPOSITION; LOCATION PREDICTION; POSTTRANSLATIONAL MODIFICATIONS; PHOSPHORYLATION SITES; ENSEMBLE CLASSIFIER; MULTI-LOCALIZATION; FUSION CLASSIFIER; EUK-MPLOC;
DOI
10.1007/s10822-012-9628-0
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology];
Discipline classification code
071010 ; 081704 ;
Abstract
The function of a protein is generally related to its subcellular localization, so knowing where a protein is located helps in understanding its potential functions and roles in biological processes. This work develops a hybrid method, called EuLoc, for computationally predicting the subcellular localization of eukaryotic proteins. EuLoc combines the hidden Markov model (HMM) method, a homology search approach and the support vector machine (SVM) method, fusing several new features into Chou's pseudo-amino acid composition. The SVM module overcomes the shortcoming of the homology search approach when a query protein has only low-homology or non-homologous matches in a database annotated with subcellular localization. The HMM modules overcome the shortcoming of the SVM in predicting subcellular localizations for which few protein sequences are available. Several kinds of features of a protein sequence are considered, including sequence-based features, biological features derived from PROSITE, NLSdb and Pfam, post-translational modification features and others. The overall accuracy and location accuracy of EuLoc are 90.5% and 91.2%, respectively, which is better than the performance obtained elsewhere. Although the amounts of data for the various subcellular location groups in the benchmark dataset differ markedly, EuLoc's accuracies for the 12 subcellular localizations range from 82.5% to 100%, indicating that it is much more balanced than other tools. EuLoc thus offers high, balanced predictive power for each subcellular localization. EuLoc is available on the web at http://euloc.mbc.nctu.edu.tw/.
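As a rough illustration of the pseudo-amino acid composition (PseAAC) idea mentioned in the abstract, the sketch below computes the classic form of the descriptor: 20 normalized amino acid frequencies followed by a few sequence-order correlation terms. It is not EuLoc's feature set. The function name `pse_aac`, the parameters `lam` and `w`, and the use of a single property (Kyte-Doolittle hydropathy) are assumptions made only for this example.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Kyte-Doolittle hydropathy scale. Assumption: one property stands in for the
# several physicochemical properties that published PseAAC variants combine.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def _standardize(scale):
    """Rescale a property table to zero mean and unit variance over the 20 residues."""
    mean = sum(scale.values()) / len(scale)
    var = sum((v - mean) ** 2 for v in scale.values()) / len(scale)
    std = var ** 0.5
    return {aa: (v - mean) / std for aa, v in scale.items()}


H = _standardize(KYTE_DOOLITTLE)


def pse_aac(sequence, lam=3, w=0.05):
    """Return a (20 + lam)-dimensional classic PseAAC vector (illustrative only)."""
    # Keep only the 20 standard residues; anything else is skipped.
    seq = [c for c in sequence.upper() if c in H]
    n = len(seq)
    if n <= lam:
        raise ValueError("sequence must be longer than lam")

    # Conventional amino acid composition.
    counts = Counter(seq)
    freqs = [counts[aa] / n for aa in AMINO_ACIDS]

    # Sequence-order correlation factors theta_1 .. theta_lam:
    # mean squared property difference between residues k positions apart.
    thetas = []
    for k in range(1, lam + 1):
        total = sum((H[seq[i + k]] - H[seq[i]]) ** 2 for i in range(n - k))
        thetas.append(total / (n - k))

    # Joint normalization so that the whole vector sums to 1.
    denom = sum(freqs) + w * sum(thetas)
    return [f / denom for f in freqs] + [w * t / denom for t in thetas]


vector = pse_aac("MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ")
print(len(vector))
```

A vector of this kind could be passed to a classifier such as an SVM; the additional PROSITE, NLSdb, Pfam and modification-site features described in the abstract would need to be appended separately.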
Pages: 91-103
Number of pages: 13