Scaling Machine Learning for Target Prediction in Drug Discovery using Apache Spark

Cited by: 5
Authors
Harnie, Dries [1, 3]
Vapirev, Alexander E. [2, 3]
Wegner, Jorg Kurt [2]
Gedich, Andrey [6]
Steijaert, Marvin [7]
Wuyts, Roel [3, 4, 5]
De Meuter, Wolfgang [1]
Affiliations
[1] Vrije Univ Brussel, Software Languages Lab, Pl Laan 2, B-1050 Brussels, Belgium
[2] Janssen Pharmaceut, B-2340 Beerse, Belgium
[3] ExaSci Life Lab, B-3001 Leuven, Belgium
[4] IMEC, B-3001 Leuven, Belgium
[5] Katholieke Univ Leuven, DistriNet, B-3001 Leuven, Belgium
[6] ARCADIA Inc, Rostra Business Ctr, St Petersburg 195112, Russia
[7] OpenAnalytics, B-2220 Heist Op Den Berg, Belgium
Source
2015 15TH IEEE/ACM INTERNATIONAL SYMPOSIUM ON CLUSTER, CLOUD AND GRID COMPUTING | 2015
Keywords
IDENTIFICATION; TOOL;
DOI
10.1109/CCGrid.2015.50
Chinese Library Classification number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
In the context of drug discovery, a key problem is the identification of candidate molecules that affect proteins associated with diseases. Inside Janssen Pharmaceutica, the Chemogenomics project aims to derive new candidates from existing experiments through a set of machine learning predictor programs, written in single-node C++. These programs take a long time to run and are inherently parallel, but do not use multiple nodes. We show how we reimplemented the pipeline using Apache Spark, which enabled us to lift the existing programs to a multi-node cluster without making changes to the predictors. We have benchmarked our Spark pipeline against the original, which shows almost linear speedup up to 8 nodes. In addition, our pipeline generates fewer intermediate files while allowing easier checkpointing and monitoring.
Pages: 871-879
Page count: 9