Gram-positive and Gram-negative protein subcellular localization by incorporating evolutionary-based descriptors into Chou's general PseAAC

Cited by: 220
Authors
Dehzangi, Abdollah [1, 2]
Heffernan, Rhys [3]
Sharma, Alok [1, 4]
Lyons, James [3]
Paliwal, Kuldip [3]
Sattar, Abdul [1, 2]
Affiliations
[1] Griffith Univ, IIIS, PO 4111, Brisbane, Qld 4111, Australia
[2] Natl ICT Australia NICTA, Brisbane, Qld, Australia
[3] Griffith Univ, Sch Engn, Brisbane, Qld 4111, Australia
[4] Univ S Pacific, Sch Phys & Engn, Suva, Fiji
Keywords
Evolutionary-based features; Segmented autocorrelation; Segmented distribution; Support Vector Machine (SVM); AMINO-ACID-COMPOSITION; LABEL LEARNING CLASSIFIER; FOLD RECOGNITION; MYCOBACTERIAL PROTEINS; ENSEMBLE CLASSIFIER; MEMBRANE-PROTEINS; SCORING MATRIX; WEB SERVER; PREDICTION; LOCATIONS;
DOI
10.1016/j.jtbi.2014.09.029
Chinese Library Classification number
Q [Biological Sciences];
Discipline classification code
07 ; 0710 ; 09 ;
Abstract
Protein subcellular localization is the task of predicting where in the cell a given protein functions. It is considered an important step towards protein function prediction and drug design. Recent studies have shown that relying on Gene Ontology (GO) for feature extraction can improve protein subcellular localization prediction performance. However, the problem remains unsolved when relying solely on GO. At the same time, the impact of other sources of features, especially evolutionary-based features, has not been adequately explored for this task. In this study, we aim to extract discriminative evolutionary features to tackle this problem. To do this, we propose two segmentation-based feature extraction methods to explore potential local evolutionary-based information for Gram-positive and Gram-negative subcellular localization. We show that by applying a Support Vector Machine (SVM) classifier to our extracted features, we improve Gram-positive and Gram-negative subcellular localization prediction accuracies by up to 6.4% over previous studies, including those that used GO for feature extraction. (C) 2014 Elsevier Ltd. All rights reserved.
Pages: 284-294
Page count: 11