Massively Parallel Processing Database for Sequence and Graph Data Structures Applied to Rapid-Response Drug Repurposing

Cited by: 3
Authors
Rickett, Christopher D. [1]
Maschhoff, Kristyn J. [1]
Sukumar, Sreenivas R. [1]
Affiliations
[1] Hewlett Packard Enterprise, Spring, TX 77389 USA
Source
2020 IEEE International Conference on Big Data (Big Data) | 2020
Keywords
graph database; graph analytics; in-database analytics; distributed processing; parallel processing; sequence analytics
DOI
10.1109/BigData50022.2020.9378331
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In this paper, we present the application of a massively parallel processing graph database to rapid-response drug repurposing. The novelty of our approach is that the scalable graph database both hosts a knowledge graph of medically relevant facts integrated from multiple knowledge sources and acts as a computational engine capable of in-database protein-sequence analytics. We demonstrate the performance of the graph database on a real-world use case, hypothesizing cures for COVID-19, leveraging its built-in accelerated protein-sequence matching capabilities at unprecedented scale (to simultaneously meet the data-size and query-latency requirements of interactive research). Based on supporting evidence from the medical literature, we show that computing the similarity of COVID-19 virus proteins against 4 million other open-science sequences and intelligently traversing over 150 billion facts from open-science medical knowledge produces biologically insightful results. By presenting sample queries and extending the application to use cases beyond COVID-19, we demonstrate the use and value of the novel database for hypothesis generation, reducing time-to-insight and increasing researcher productivity through interactivity.
Pages: 2967-2976
Page count: 10