Position-Specific Analysis and Prediction of Protein Pupylation Sites Based on Multiple Features

Cited by: 21
Authors
Zhao, Xiaowei [1 ,2 ]
Dai, Jiangyan [1 ]
Ning, Qiao [1 ]
Ma, Zhiqiang [1 ,2 ]
Yin, Minghao [2 ]
Sun, Pingping [1 ]
Affiliations
[1] NE Normal Univ, Coll Comp Sci & Informat Technol, Changchun 130117, Peoples R China
[2] NE Normal Univ, Key Lab Intelligent Informat Proc Jilin Univ, Changchun 130117, Peoples R China
Keywords
ENSEMBLE CLASSIFIER; FEATURE-SELECTION; CD-HIT; UBIQUITIN; PUP; IDENTIFICATION; LOCATION; UBIQUITYLATION; DATABASE; PROGRESS;
DOI
10.1155/2013/109549
Chinese Library Classification
Q81 [Bioengineering (Biotechnology)]; Q93 [Microbiology];
Discipline classification codes
071005 ; 0836 ; 090102 ; 100705 ;
Abstract
Pupylation is one of the most important posttranslational modifications of proteins, and accurate identification of pupylation sites will facilitate understanding of its molecular mechanism. Alongside conventional experimental approaches, computational prediction of pupylation sites is highly desirable for its convenience and speed. In this study, we developed a novel predictor of pupylation sites. First, the maximum relevance minimum redundancy (mRMR) and incremental feature selection methods were applied to five kinds of features to select the optimal feature set. The prediction model was then built on this optimal feature set with the assistance of the support vector machine algorithm. On a newly constructed benchmark dataset, the new predictor achieved an overall jackknife success rate of 0.764 and a Matthews correlation coefficient of 0.522, indicating good predictive performance. Feature analysis showed that all feature types contributed to the prediction of protein pupylation sites. Further site-specific feature analysis revealed that the features of sites surrounding the central lysine contributed more to the determination of pupylation sites than those of the other sites.
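The two evaluation measures and the incremental feature selection step named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the `ranked` feature list (standing in for an mRMR ranking computed elsewhere), and the `evaluate` callback (standing in for a jackknife-validated SVM score) are all assumptions made for illustration.

```python
import math
from typing import Callable, List, Sequence, Tuple


def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Overall success rate: fraction of all samples classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)


def matthews_cc(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient from confusion-matrix counts.

    Returns 0.0 when the denominator vanishes (a common convention).
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


def incremental_feature_selection(
    ranked: Sequence[str],
    evaluate: Callable[[Sequence[str]], float],
) -> Tuple[List[str], float]:
    """Score the top-1, top-2, ..., top-N ranked features and keep the best prefix.

    `ranked` is assumed to be ordered best-first (for example by mRMR);
    `evaluate` is a hypothetical callback that returns a score for a feature
    subset, such as the MCC of a cross-validated classifier.
    """
    best_k, best_score = 0, float("-inf")
    for k in range(1, len(ranked) + 1):
        score = evaluate(ranked[:k])
        if score > best_score:  # strict '>' keeps the smallest best subset
            best_k, best_score = k, score
    return list(ranked[:best_k]), best_score
```

In use, `evaluate` would wrap whatever classifier and validation scheme is being studied; the selection loop itself only needs the ranking and a score per prefix.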
Pages: 9
Related papers
45 in total
[1]   Gapped BLAST and PSI-BLAST: a new generation of protein database search programs [J].
Altschul, SF ;
Madden, TL ;
Schaffer, AA ;
Zhang, JH ;
Zhang, Z ;
Miller, W ;
Lipman, DJ .
NUCLEIC ACIDS RESEARCH, 1997, 25 (17) :3389-3402
[2]   Solving the protein sequence metric problem [J].
Atchley, WR ;
Zhao, JP ;
Fernandes, AD ;
Drüke, T .
PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 2005, 102 (18) :6395-6400
[3]   Reconstitution of the Mycobacterium tuberculosis pupylation pathway in Escherichia coli [J].
Cerda-Maira, Francisca A. ;
McAllister, Fiona ;
Bode, Nadine J. ;
Burns, Kristin E. ;
Gygi, Steven P. ;
Darwin, K. Heran .
EMBO REPORTS, 2011, 12 (08) :863-870
[4]   LIBSVM: A Library for Support Vector Machines [J].
Chang, Chih-Chung ;
Lin, Chih-Jen .
ACM TRANSACTIONS ON INTELLIGENT SYSTEMS AND TECHNOLOGY, 2011, 2 (03)
[5]   SCRATCH: a protein structure and structural feature prediction server [J].
Cheng, J ;
Randall, AZ ;
Sweredoski, MJ ;
Baldi, P .
NUCLEIC ACIDS RESEARCH, 2005, 33 :W72-W76
[6]   PREDICTION OF PROTEIN STRUCTURAL CLASSES [J].
CHOU, KC ;
ZHANG, CT .
CRITICAL REVIEWS IN BIOCHEMISTRY AND MOLECULAR BIOLOGY, 1995, 30 (04) :275-349
[7]   Cell-PLoc: a package of Web servers for predicting subcellular localization of proteins in various organisms [J].
Chou, Kuo-Chen ;
Shen, Hong-Bin .
NATURE PROTOCOLS, 2008, 3 (02) :153-162
[8]   Recent progress in protein subcellular location prediction [J].
Chou, Kuo-Chen ;
Shen, Hong-Bin .
ANALYTICAL BIOCHEMISTRY, 2007, 370 (01) :1-16
[9]   Large-scale plant protein subcellular location prediction [J].
Chou, Kuo-Chen ;
Shen, Hong-Bin .
JOURNAL OF CELLULAR BIOCHEMISTRY, 2007, 100 (03) :665-678
[10]   Hum-PLoc: A novel ensemble classifier for predicting human protein subcellular localization [J].
Chou, Kuo-Chen ;
Shen, Hong-Bin .
BIOCHEMICAL AND BIOPHYSICAL RESEARCH COMMUNICATIONS, 2006, 347 (01) :150-157