PGlcS: Prediction of protein O-GlcNAcylation sites with multiple features and analysis

Cited by: 14

Authors:

Zhao, Xiaowei [1]

Ning, Qiao [1]

Chai, Haiting [1]

Ai, Meiyue [1]

Ma, Zhiqiang [1]

Affiliations:

[1] NE Normal Univ, Sch Comp Sci & Informat Technol, Changchun 130117, Peoples R China

Source:

JOURNAL OF THEORETICAL BIOLOGY | 2015 / Volume 380

Funding:

National Natural Science Foundation of China;

Keywords:

O-GlcNAcylated mechanisms; Support vector machines; A two-step feature selection; k-means cluster; AMINO-ACID-COMPOSITION; S-NITROSYLATION SITES; REMOTE HOMOLOGY DETECTION; SEQUENCE-BASED PREDICTOR; PHYSICOCHEMICAL PROPERTIES; ENSEMBLE CLASSIFIER; GENERAL-FORM; PSEAAC; IDENTIFICATION; MODES;

DOI:

10.1016/j.jtbi.2015.06.026

Chinese Library Classification (CLC) number:

Q [Biological Sciences];

Subject classification codes:

07 ; 0710 ; 09 ;

Abstract:

As a widespread type of protein post-translational modification, O-GlcNAcylation plays crucial regulatory roles in almost all cellular processes and is related to some diseases. To understand O-GlcNAcylation mechanisms in depth, identifying substrates and their specific O-GlcNAcylated sites is crucial. Experimental identification is expensive and time-consuming, so computational prediction of O-GlcNAcylated sites has considerable value. In this work, we developed a novel O-GlcNAcylation site predictor called PGlcS (Prediction of O-GlcNAcylated Sites), which uses k-means clustering to obtain informative and reliable negative samples and a support vector machine classifier combined with a two-step feature selection. The performance of PGlcS was evaluated on an independent testing dataset, resulting in a sensitivity of 64.62%, a specificity of 68.4%, an accuracy of 68.37%, and a Matthews correlation coefficient of 0.0697, which demonstrated that PGlcS was very promising for predicting O-GlcNAcylated sites. The datasets and source code are available in the Supplementary information. (C) 2015 Elsevier Ltd. All rights reserved.
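The four evaluation measures quoted in the abstract are all functions of the confusion-matrix counts. The following is a minimal sketch of their standard definitions, not code from the paper. The function name `binary_metrics` and the counts in the usage lines are hypothetical, chosen only to show that on a heavily imbalanced test set a sensitivity and specificity in the mid-60% range can coexist with a Matthews correlation coefficient below 0.1.

```python
import math


def binary_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy, MCC) from confusion-matrix counts.

    tp/fn: positive sites predicted as positive/negative.
    tn/fp: negative sites predicted as negative/positive.
    """
    sensitivity = tp / (tp + fn)          # fraction of true sites recovered
    specificity = tn / (tn + fp)          # fraction of non-sites rejected
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # MCC is conventionally set to 0 when any marginal total is zero.
    mcc = (tp * tn - fp * fn) / denom if denom > 0 else 0.0
    return sensitivity, specificity, accuracy, mcc


# Hypothetical counts (NOT from the paper): 50 positive and 5000 negative sites.
sn, sp, acc, mcc = binary_metrics(tp=32, fn=18, tn=3400, fp=1600)
print(f"Sn={sn:.4f} Sp={sp:.4f} Acc={acc:.4f} MCC={mcc:.4f}")
```

Because negatives dominate in this illustrative setting, accuracy tracks specificity closely, while MCC is pulled down by the large number of false positives relative to true positives.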


Pages: 524-529

Page count: 6

