APEX2S: A two-layer machine learning model for discovery of host-pathogen protein-protein interactions on cloud-based multiomics data

Cited by: 4
Authors
Chen, Huaming [1]
Shen, Jun [1]
Wang, Lei [1]
Chi, Chi-Hung [2]
Affiliations
[1] Univ Wollongong, Sch Comp & Informat Technol, Wollongong, NSW 2500, Australia
[2] CSIRO, Data61, Canberra, ACT, Australia
Keywords
big data; computational biology; data analysis; machine learning; PREDICTION; ORGANIZATION; SUPPORT; TOOL
DOI
10.1002/cpe.5846
Chinese Library Classification number
TP31 [Computer software]
Discipline classification code
081202; 0835
Abstract
Confronted with an avalanche of biological interaction data, computational biology now faces greater challenges in big data analysis and calls for more studies that mine and integrate cloud-based multiomics data, especially data related to infectious diseases. Meanwhile, machine learning techniques have recently succeeded in a range of computational biology tasks. In this article, we focus on the study of host-pathogen protein-protein interactions, aiming to apply machine learning techniques to learn from interaction data and make predictions. We discuss a comprehensive and practical workflow for harnessing different cloud-based multiomics data. In particular, we propose a novel two-layer machine learning model, APEX2S, for the discovery of protein-protein interactions. The results show that our model can better learn and predict from the accumulated host-pathogen protein-protein interactions.
Pages: 16