SCMBYK: prediction and characterization of bacterial tyrosine-kinases based on propensity scores of dipeptides

Cited by: 15
Authors
Vasylenko, Tamara [1]
Liou, Yi-Fan [1]
Chiou, Po-Chin [1]
Chu, Hsiao-Wei [1]
Lai, Yung-Sung [1]
Chou, Yu-Ling [1]
Huang, Hui-Ling [1, 2, 3]
Ho, Shinn-Ying [1, 2, 3]
Affiliations
[1] Natl Chiao Tung Univ, Inst Bioinformat & Syst Biol, Hsinchu 300, Taiwan
[2] Natl Chiao Tung Univ, Coll Biol Sci & Technol, Hsinchu 300, Taiwan
[3] Natl Chiao Tung Univ, Ctr Bioinformat Res, Hsinchu, Taiwan
Keywords
BY-kinase; Scoring card method; Drug repurposing; Propensity scores; Dipeptide; AMINO-ACID-COMPOSITION; DIFFERENTIAL GEOMETRY; POLYMER CONFORMATION; PHOSPHORYLATION; TUBERCULOSIS; EVOLUTION
DOI
10.1186/s12859-016-1371-4
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Discipline classification code
071010; 081704
Abstract
Background: Bacterial tyrosine-kinases (BY-kinases), which play an important role in numerous cellular processes, are characterized as a separate class of enzymes and share no structural similarity with their eukaryotic counterparts. However, in silico methods for predicting BY-kinases have not been developed yet. Since these enzymes are involved in key regulatory processes, and are promising targets for anti-bacterial drug design, it is desirable to develop a simple and easily interpretable predictor to gain new insights into bacterial tyrosine phosphorylation. This study proposes a novel SCMBYK method for predicting and characterizing BY-kinases. Results: A dataset consisting of 797 BY-kinases and 783 non-BY-kinases was established to design the SCMBYK predictor, which achieved training and test accuracies of 97.55% and 96.73%, respectively. Furthermore, the leave-one-phylum-out method was used to predict specific bacterial phyla hosts of target sequences, gaining 97.39% average test accuracy. After analyzing SCMBYK-derived propensity scores, four characteristics of BY-kinases were determined: 1) BY-kinases tend to be composed of α-helices; 2) the amino-acid content of extracellular regions of BY-kinases is expected to be dominated by residues such as Val, Ile, Phe and Tyr; 3) BY-kinases structurally resemble nuclear proteins; 4) different domains play different roles in triggering BY-kinase activity. Conclusions: The SCMBYK predictor is an effective method for identification of possible BY-kinases. Furthermore, it can be used as a part of a novel drug repurposing method, which recognizes putative BY-kinases and matches them to approved drugs. Among other results, our analysis revealed that azathioprine could suppress the virulence of M. tuberculosis, and thus be considered as a potential antibiotic for tuberculosis treatment.
Pages: 15