Quokka: a comprehensive tool for rapid and accurate prediction of kinase family-specific phosphorylation sites in the human proteome

Cited by: 141
Authors
Li, Fuyi [1, 2]
Li, Chen [1, 2, 3]
Marquez-Lago, Tatiana T. [4]
Leier, Andre [4]
Akutsu, Tatsuya [5]
Purcell, Anthony W. [1, 2]
Smith, A. Ian [1, 2, 6]
Lithgow, Trevor [1, 7]
Daly, Roger J. [1, 2]
Song, Jiangning [1, 2, 8]
Chou, Kuo-Chen [9]
Affiliations
[1] Monash Univ, Biomed Discovery Inst, Clayton, Vic 3800, Australia
[2] Monash Univ, Dept Biochem & Mol Biol, Clayton, Vic 3800, Australia
[3] Swiss Fed Inst Technol, Inst Mol Syst Biol, Dept Biol, CH-8093 Zurich, Switzerland
[4] Univ Alabama Birmingham, Sch Med, Dept Genet, Birmingham, AL 35294 USA
[5] Kyoto Univ, Inst Chem Res, Bioinformat Ctr, Kyoto 6110011, Japan
[6] Monash Univ, ARC Ctr Excellence Adv Mol Imaging, Melbourne, Vic 3800, Australia
[7] Monash Univ, Dept Microbiol, Clayton, Vic 3800, Australia
[8] Monash Univ, Monash Ctr Data Sci, Clayton, Vic 3800, Australia
[9] Gordon Life Sci Inst, Boston, MA 02478 USA
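Funding
UK Medical Research Council; Australian Research Council; US National Institutes of Health; National Health and Medical Research Council of Australia
Keywords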
DNA-DAMAGE RESPONSE; CLASS-I; CELL-DIVISION; SEQUENCE; GLYCOSYLATION; PROMOTERS; PROTEINS; FEATURES; KINOME;
DOI
10.1093/bioinformatics/bty522
Chinese Library Classification
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Motivation: Kinase-regulated phosphorylation is a ubiquitous type of post-translational modification (PTM) in both eukaryotic and prokaryotic cells. Phosphorylation plays fundamental roles in many signalling pathways and biological processes, such as protein degradation and protein-protein interactions. Experimental studies have revealed that signalling defects caused by aberrant phosphorylation are highly associated with a variety of human diseases, especially cancers. In light of this, a number of computational methods aiming to accurately predict protein kinase family-specific or kinase-specific phosphorylation sites have been established, thereby facilitating phosphoproteomic data analysis.
Results: In this work, we present Quokka, a novel bioinformatics tool that allows users to rapidly and accurately identify human kinase family-regulated phosphorylation sites. Quokka was developed by using a variety of sequence scoring functions combined with an optimized logistic regression algorithm. We evaluated Quokka based on well-prepared up-to-date benchmark and independent test datasets, curated from the Phospho.ELM and UniProt databases, respectively. The independent test demonstrates that Quokka improves the prediction performance compared with state-of-the-art computational tools for phosphorylation prediction. In summary, our tool provides users with high-quality predicted human phosphorylation sites for hypothesis generation and biological validation.
Pages: 4223-4231
Page count: 9