iCar-PseCp: identify carbonylation sites in proteins by Monte Carlo sampling and incorporating sequence coupled effects into general PseAAC

Cited by: 175
Authors
Jia, Jianhua [1, 2]
Liu, Zi [3]
Xiao, Xuan [1, 2]
Liu, Bingxiang [1]
Chou, Kuo-Chen [2, 4]
Affiliations
[1] Jing De Zhen Ceram Inst, Dept Comp, Jing De Zhen 333403, Peoples R China
[2] Gordon Life Sci Inst, Boston, MA 02478 USA
[3] Nanjing Univ Sci & Technol, Sch Comp Sci & Engn, Nanjing 210094, Peoples R China
[4] King Abdulaziz Univ, Ctr Excellence Genom Med Res CEGMR, Jeddah 21589, Saudi Arabia
Keywords
carbonylation; sequence-coupling model; PseAAC; Monte Carlo sampling; random forest algorithm; AMINO-ACID-COMPOSITION; POSTTRANSLATIONAL MODIFICATION SITES; LYSINE SUCCINYLATION SITES; LABEL LEARNING CLASSIFIER; PROTEASE CLEAVAGE SITES; SUBCELLULAR LOCATION; ENSEMBLE CLASSIFIER; K-TUPLE; PHYSICOCHEMICAL PROPERTIES; WEB-SERVER
DOI
10.18632/oncotarget.9148
Chinese Library Classification number
R73 [Oncology];
Discipline classification code
100214;
Abstract
Carbonylation is a posttranslational modification (PTM or PTLM) in which a carbonyl group is added to a lysine (K), proline (P), arginine (R), or threonine (T) residue of a protein molecule. Carbonylation plays an important role in orchestrating various biological processes, but it is also associated with many diseases such as diabetes, chronic lung disease, Parkinson's disease, Alzheimer's disease, chronic renal failure, and sepsis. Therefore, from the angles of both basic research and drug development, we are facing a challenging problem: for an uncharacterized protein sequence containing many K, P, R, or T residues, which ones can be carbonylated, and which ones cannot? To address this problem, we have developed a predictor called iCar-PseCp by incorporating sequence-coupled information into the general pseudo amino acid composition (PseAAC), and by balancing the skewed training dataset with Monte Carlo sampling to expand the positive subset. Rigorous target cross-validations on the same set of carbonylation-known proteins indicated that the new predictor remarkably outperformed its existing counterparts. For the convenience of most experimental scientists, a user-friendly web-server for iCar-PseCp has been established at http://www.jci-bioinfo.cn/iCar-PseCp, by which users can easily obtain their desired results without the need to go through the complicated mathematical equations involved. It has not escaped our notice that the formulation and approach presented here can also be used to analyze many other problems in computational proteomics.
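The abstract does not spell out how the positive subset is expanded, so the following is only a minimal sketch of the general idea of balancing a skewed dataset by random (Monte Carlo) resampling, not the authors' procedure. The helper name `expand_positive_subset`, the toy peptide strings, and the choice of uniform sampling with replacement are all assumptions made for illustration.

```python
import random


def expand_positive_subset(positives, negatives, seed=0):
    """Hypothetical helper: grow the positive subset by uniform random
    draws (with replacement) until it matches the negative subset in size.
    This is an illustrative assumption, not the method from the paper."""
    if not positives:
        raise ValueError("need at least one positive sample to resample from")
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    expanded = list(positives)  # keep every original positive sample
    while len(expanded) < len(negatives):
        expanded.append(rng.choice(positives))
    return expanded


# Toy, made-up peptide windows; real inputs would be feature vectors.
positives = ["AAKAA", "GGPGG"]
negatives = ["AARAA", "TTTTT", "KKKKK", "PPRPP", "LLKLL", "MMTMM"]
balanced_positives = expand_positive_subset(positives, negatives)
```

After this step the two subsets have equal size, so a classifier such as the random forest named in the keywords would no longer be trained on a skewed class ratio.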
Pages: 34558-34570
Number of pages: 13