Prediction of Interactions between Viral and Host Proteins Using Supervised Machine Learning Methods

Cited by: 61
Authors
Barman, Ranjan Kumar [1]
Saha, Sudipto [3]
Das, Santasabuj [1, 2]
Affiliations
[1] Natl Inst Cholera & Enter Dis, Biomed Informat Ctr, Kolkata, W Bengal, India
[2] Natl Inst Cholera & Enter Dis, Div Clin Med, Kolkata, W Bengal, India
[3] Bose Inst, Bioinformat Ctr, Kolkata, W Bengal, India
Keywords
HEPATITIS-E VIRUS; INFORMATION; SEQUENCES; NETWORK; SWINE; MODEL; INDIA;
DOI
10.1371/journal.pone.0112034
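Chinese Library Classification (CLC) number
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences];
Subject classification codes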
07 ; 0710 ; 09 ;
Abstract
Background: Viral-host protein-protein interactions play a vital role in pathogenesis, since they determine viral infection of the host and regulation of the host proteins. Identification of key viral-host protein-protein interactions (PPIs) has important implications for therapeutics.

Methods: In this study, a systematic attempt was made to predict viral-host PPIs by integrating different features, including domain-domain association, network topology and sequence information, using viral-host PPIs from VirusMINT. Three well-known supervised machine learning methods that are commonly used in the prediction of PPIs (SVM, Naive Bayes and Random Forest) were evaluated using five-fold cross-validation.

Results: Out of 44 descriptors, the best features were found to be domain-domain association and the methionine, serine and valine amino acid composition of viral proteins. The SVM-based method achieved a higher sensitivity (67%) than Naive Bayes (37.49%) and Random Forest (55.66%). However, the specificity of Naive Bayes was the highest (99.52%), compared with SVM (74%) and Random Forest (89.08%). Overall, SVM and Random Forest achieved accuracies of 71% and 72.41%, respectively. The proposed SVM-based method was evaluated on a blind dataset and attained a sensitivity of 64%, a specificity of 83% and an accuracy of 74%. In addition, previously unknown potential targets in hepatitis B virus-human and hepatitis E virus-human PPIs were predicted with the proposed SVM model and validated by gene ontology enrichment analysis. The proposed model predicts that the hepatitis B virus "C protein" binds to membrane docking protein, while the "X protein" and "P protein" interact with cell-killing and metabolic process proteins, respectively.

Conclusion: The proposed method can predict large-scale interspecies viral-human PPIs. Host-protein interaction partners of HBV and HEV proteins, and from them the likely nature and function of these viral proteins, were identified using the optimised SVM model.
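As an illustration of two quantities the abstract relies on, the following minimal sketch computes the amino acid composition of a protein sequence (one of the sequence-derived feature types mentioned) and the sensitivity, specificity and accuracy of a binary interaction classifier. It is not the authors' implementation: the function names, the 0/1 label encoding and the toy inputs are assumptions made here for illustration only.

```python
from collections import Counter

# The 20 standard amino acids (one-letter codes).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(sequence):
    """Return the percentage of each standard residue in a protein sequence.

    Hypothetical helper: non-standard characters are ignored.
    """
    counts = Counter(ch for ch in sequence.upper() if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return {aa: 100.0 * counts[aa] / total for aa in AMINO_ACIDS}


def binary_metrics(y_true, y_pred):
    """Sensitivity, specificity and accuracy for 0/1 labels.

    Assumed encoding: 1 = interacting pair, 0 = non-interacting pair.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {
        # Fraction of true interactions that were recovered.
        "sensitivity": tp / (tp + fn) if (tp + fn) else float("nan"),
        # Fraction of non-interactions that were correctly rejected.
        "specificity": tn / (tn + fp) if (tn + fp) else float("nan"),
        # Fraction of all pairs classified correctly.
        "accuracy": (tp + tn) / len(pairs) if pairs else float("nan"),
    }


if __name__ == "__main__":
    # Toy sequence and toy labels, not data from the study.
    composition = amino_acid_composition("MSVM")
    print({aa: composition[aa] for aa in "MSV"})
    print(binary_metrics([1, 1, 1, 0, 0, 0, 0, 1], [1, 1, 0, 0, 0, 1, 0, 1]))
```

In a five-fold cross-validation setting such as the one described, metrics like these would typically be computed on each held-out fold and then averaged; how the study aggregated them is not stated in the abstract.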
Pages: 10