A top-down approach to enhance the power of predicting human protein subcellular localization: Hum-mPLoc 2.0

Cited by: 145
Authors
Shen, Hong-Bin [1]
Chou, Kuo-Chen
Affiliations
[1] Shanghai Jiao Tong Univ, Inst Image Proc & Pattern Recognit, Shanghai 200240, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Multiplex protein; Homology search; Representative proteins; Gene ontology; Functional domain; Sequential evolution; Ensemble classifier; Fusion approach; AMINO-ACID-COMPOSITION; SUPPORT VECTOR MACHINES; FUNCTIONAL DOMAIN COMPOSITION; SEQUENTIAL EVOLUTION INFORMATION; LOCATION PREDICTION; QUATERNARY STRUCTURE; GENE ONTOLOGY; WEB SERVER; ENSEMBLE CLASSIFIER; STRUCTURAL CLASSES;
DOI
10.1016/j.ab.2009.07.046
Chinese Library Classification number
Q5 [Biochemistry];
Discipline classification codes
071010 ; 081704 ;
Abstract
Predicting subcellular localization of human proteins is a challenging problem, particularly when query proteins may have a multiplex character, i.e., simultaneously exist at, or move between, two or more different subcellular location sites. In a previous study, we developed a predictor called "Hum-mPLoc" to deal with the multiplex problem for the human protein system. However, Hum-mPLoc has the following shortcomings. (1) The accession number of a query protein must be supplied in order to obtain a higher expected success rate by selecting the higher-level prediction pathway; but many proteins, such as synthetic and hypothetical proteins as well as newly discovered proteins that have not yet been deposited into databanks, do not have accession numbers. (2) Neither functional domain nor sequential evolution information was taken into account in Hum-mPLoc, and hence its power may be reduced accordingly. In view of this, a top-down strategy to address these shortcomings has been implemented. The new predictor thus obtained is called Hum-mPLoc 2.0, where the accession number is no longer needed as input. Moreover, both the functional domain information and the sequential evolution information have been fused into the predictor by an ensemble classifier. As a consequence, the prediction power has been significantly enhanced. The web server of Hum-mPLoc 2.0 is freely accessible at http://www.csbio.sjtu.edu.cn/bioinf/hum-multi-2/. (C) 2009 Elsevier Inc. All rights reserved.
Pages: 269-274
Number of pages: 6