SUMOgo: Prediction of sumoylation sites on lysines by motif screening models and the effects of various post-translational modifications

Cited by: 30
Authors
Chang, Chi-Chang [1, 2]
Tung, Chi-Hua [3]
Chen, Chi-Wei [4, 5]
Tu, Chin-Hau [5]
Chu, Yen-Wei [5, 6]
Affiliations
[1] Chung Shan Med Univ, Sch Med Informat, Taichung, Taiwan
[2] Chung Shan Med Univ Hosp, IT Off, Taichung, Taiwan
[3] Chung Hua Univ, Dept Bioinformat, Rm S116,707,Sec 2,WuFu Rd, Hsinchu 30012, Taiwan
[4] Natl Chung Hsing Univ, Dept Comp Sci & Engn, 250 Kuo Kuang Rd, Taichung 402, Taiwan
[5] Natl Chung Hsing Univ, Inst Genom & Bioinformat, 250 Kuo Kuang Rd, Taichung 402, Taiwan
[6] Natl Chung Hsing Univ, Inst Mol Biol, Agr Biotechnol Ctr, Biotechnol Ctr, 250 Kuo Kuang Rd, Taichung 402, Taiwan
Keywords
WEB SERVER; CD-HIT; PROTEIN; RESOURCE; BINDING; TOOL;
DOI
10.1038/s41598-018-33951-5
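Chinese Library Classification (CLC) codes
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences]
Subject classification codes
07; 0710; 09
Abstract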
Most modern tools used to predict sites of small ubiquitin-like modifier (SUMO) binding (referred to as SUMOylation) use algorithms, chemical features of the protein, and consensus motifs. However, these tools rarely consider how post-translational modification (PTM) information for other sites within the same protein influences the accuracy of the prediction results. This study applied the Random Forest machine learning method, together with motif screening models and a feature selection combination mechanism, to develop a SUMOylation prediction system referred to as SUMOgo. In the prediction method, PTM sites were coded as new functional features in addition to structural features, such as sequence-based binary coding, encoded chemical features of proteins, and encoded secondary structure information that is important for PTM. Twenty cycles of prediction were conducted with a 1:1 combination of positive test data and random negative data. The Matthews correlation coefficient (MCC) of SUMOgo reached 0.511, which is higher than that of commonly used current tools. This study further verified the important role of PTM in SUMOgo and includes a case study on CREB binding protein (CREBBP). The website for the final tool is http://predictor.nchu.edu.tw/SUMOgo.
Pages: 10