Bigram-PGK: phosphoglycerylation prediction using the technique of bigram probabilities of position specific scoring matrix

Cited by: 14
Authors
Chandra, Abel [1]
Sharma, Alok [1,2,3,4,5]
Dehzangi, Abdollah [6]
Shigemizu, Daichi [3,4,5,7]
Tsunoda, Tatsuhiko [3,4,5,8]
Affiliations
[1] Univ South Pacific, Fac Sci Technol & Environm, Sch Engn & Phys, Suva, Fiji
[2] Griffith Univ, Inst Integrated & Intelligent Syst, Brisbane, Qld 4111, Australia
[3] TMDU, Dept Med Sci Math, Med Res Inst, Tokyo 1138510, Japan
[4] RIKEN Ctr Integrat Med Sci, Lab Med Sci Math, Yokohama, Kanagawa 2300045, Japan
[5] JST, CREST, Tokyo 1028666, Japan
[6] Morgan State Univ, Dept Comp Sci, Baltimore, MD 21239 USA
[7] Natl Ctr Geriatr & Gerontol, Med Genome Ctr, Obu, Aichi 4748511, Japan
[8] Univ Tokyo, Grad Sch Sci, Dept Biol Sci, Lab Med Sci Math, Tokyo 1088639, Japan
Keywords
Post-translational modification; Phosphoglycerylation; Lysine residue; Computational technique; Evolutionary information; LYSINE PHOSPHOGLYCERYLATION; SUBCELLULAR-LOCALIZATION; GLUCOSE-METABOLISM; PROTEIN; SITES; INFORMATION; IDENTIFICATION; SVM; PHOSPHORYLATION; OPTIMIZATION
DOI
10.1186/s12860-019-0240-1
Chinese Library Classification (CLC) number
Q2 [Cell biology]
Subject classification codes
071009; 090102
Abstract
Background: Post-translational modification (PTM) is a biological process in which proteomes are modified, affecting normal cell biology and hence pathogenesis. A number of PTMs have been discovered in recent years, and lysine phosphoglycerylation is among the more recently identified. Although a large number of proteins have been sequenced in the post-genomic era, identifying phosphoglycerylation remains a major challenge because experimental approaches are costly, time-consuming and inefficient. Computational techniques have therefore emerged to identify phosphoglycerylated lysine residues, but the techniques proposed so far remain limited in how correctly they predict this covalent modification.
Results: We propose a new predictor, Bigram-PGK, which uses evolutionary information of amino acids to predict phosphoglycerylated sites. A benchmark dataset of experimentally labelled sites is employed, and profile bigram occurrences are calculated from the position specific scoring matrices of amino acids in the protein sequences. The reported sensitivity, specificity, precision, accuracy, Matthews correlation coefficient and area under the ROC curve are 0.9642, 0.8973, 0.8253, 0.9193, 0.8330 and 0.9306, respectively.
Conclusions: The proposed predictor, based on evolutionary-information features and a support vector machine classifier, shows great potential for distinguishing phosphoglycerylated from non-phosphoglycerylated lysine residues when compared against existing predictors. The data and software for this work are available at https://github.com/abelavit/Bigram-PGK.
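The profile-bigram step named in the Results can be illustrated with a short sketch. It assumes the commonly used definition B[m, n] = sum over i of P[i, m] * P[i+1, n] for a PSSM P with one row per residue and one column per amino acid type; the abstract itself does not spell out the formula. The function name `profile_bigram`, the 31-residue window and the random input are hypothetical choices for illustration only. The authors' actual implementation is in the repository linked in the abstract.

```python
import numpy as np


def profile_bigram(pssm):
    """Flattened profile-bigram features of a PSSM segment.

    Assumed definition (not stated in the abstract):
        B[m, n] = sum_i P[i, m] * P[i + 1, n]
    where P has one row per residue and one column per amino acid type.
    """
    pssm = np.asarray(pssm, dtype=float)
    if pssm.ndim != 2 or pssm.shape[0] < 2:
        raise ValueError("expected a 2-D PSSM with at least two rows")
    # Rows 0..L-2 paired with rows 1..L-1: every pair of consecutive residues.
    bigram = pssm[:-1].T @ pssm[1:]
    # A 20-column PSSM yields a 20 x 20 matrix, i.e. 400 features.
    return bigram.ravel()


if __name__ == "__main__":
    # Hypothetical input: a 31-residue window around a lysine, 20 columns.
    rng = np.random.default_rng(0)
    window = rng.random((31, 20))
    features = profile_bigram(window)
    print(features.shape)
```

A feature vector of this kind would then be passed to a classifier such as the support vector machine mentioned in the Conclusions. Whether the PSSM is normalised beforehand is a further assumption that this sketch leaves open.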
Pages: 9