pLoc-mPlant: predict subcellular localization of multi-location plant proteins by incorporating the optimal GO information into general PseAAC

Citations: 172
Authors
Cheng, Xiang [1]
Xiao, Xuan [1,2]
Chou, Kuo-Chen [2,3]
Affiliations
[1] Jingdezhen Ceram Inst, Comp Dept, Jingdezhen, Peoples R China
[2] Gordon Life Sci Inst, Boston, MA 02478 USA
[3] Univ Elect Sci & Technol China, Ctr Informat Biol, Chengdu 610054, Sichuan, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
AMINO-ACID-COMPOSITION; MULTI-LABEL CLASSIFIER; ENZYME SUBFAMILY CLASSES; SUPPORT VECTOR MACHINE; ENSEMBLE CLASSIFIER; ANTIMICROBIAL PEPTIDES; DIPEPTIDE COMPOSITION; LEARNING CLASSIFIER; LOCATION PREDICTION; MEMBRANE-PROTEINS;
DOI
10.1039/c7mb00267j
Chinese Library Classification
Q5 [Biochemistry]; Q7 [Molecular Biology];
Discipline classification codes
071010; 081704;
Abstract
One of the fundamental goals in cellular biochemistry is to identify the functions of proteins in the context of compartments that organize them in the cellular environment. To realize this, it is indispensable to develop an automated method for fast and accurate identification of the subcellular locations of uncharacterized proteins. The current study is focused on plant protein subcellular location prediction based on the sequence information alone. Although considerable efforts have been made in this regard, the problem is far from being solved yet. Most of the existing methods can be used to deal with single-location proteins only. Actually, proteins with multi-locations may have some special biological functions. This kind of multiplex protein is particularly important for both basic research and drug design. Using the multi-label theory, we present a new predictor called "pLoc-mPlant" by extracting the optimal GO (Gene Ontology) information into the Chou's general PseAAC (Pseudo Amino Acid Composition). Rigorous cross-validation on the same stringent benchmark dataset indicated that the proposed pLoc-mPlant predictor is remarkably superior to iLoc-Plant, the state-of-the-art method for predicting plant protein subcellular localization. To maximize the convenience of most experimental scientists, a user-friendly web-server for the new predictor has been established at http://www.jci-bioinfo.cn/pLoc-mPlant/, by which users can easily get their desired results without the need to go through the complicated mathematics involved.
Pages: 1722-1727
Page count: 6