A Comprehensive Investigation of Active Learning Strategies for Conducting Anti-Cancer Drug Screening

Citations: 1
Authors
Vasanthakumari, Priyanka [1]
Zhu, Yitan [1]
Brettin, Thomas [2]
Partin, Alexander [1]
Shukla, Maulik [1]
Xia, Fangfang [1]
Narykov, Oleksandr [1]
Weil, Michael Ryan [3]
Stevens, Rick L. [2, 4]
Affiliations
[1] Argonne Natl Lab, Div Data Sci & Learning, Lemont, IL 60439 USA
[2] Argonne Natl Lab, Comp Environm & Life Sci, Lemont, IL 60439 USA
[3] Frederick Natl Lab Canc Res, Canc Data Sci Initiat, Canc Res Technol Program, Rockville, MD 21701 USA
[4] Univ Chicago, Dept Comp Sci, Chicago, IL 60637 USA
Keywords
active learning; machine learning; drug response prediction; drug discovery; cancer; RESPONSES; NETWORKS
DOI
10.3390/cancers16030530
Chinese Library Classification (CLC) number
R73 [Oncology]
Discipline classification code
100214
Abstract
Simple Summary: Preclinical drug screening experiments for anti-cancer drug discovery typically involve testing candidate drugs against cancer cell lines. This process can be expensive and time-consuming because the possible experimental space can be very large, covering all combinations of candidate cell lines and drugs. Guiding drug screening experiments with active learning strategies could potentially identify promising candidates for successful experimentation. This study investigates various active learning strategies for selecting experiments to generate response data, with the aims of identifying effective treatments and improving the performance of drug response prediction models. We demonstrate that most active learning strategies are more efficient than random selection for identifying effective treatments.
Abstract: It is well known that cancers of the same histology type can respond differently to a treatment. Thus, computational drug response prediction is of paramount importance for both preclinical drug screening studies and clinical treatment design. To build drug response prediction models, treatment response data need to be generated through screening experiments and used as input to train the prediction models. In this study, we investigate various active learning strategies for selecting experiments to generate response data for the purposes of (1) improving the performance of drug response prediction models built on the data and (2) identifying effective treatments. Here, we focus on constructing drug-specific response prediction models for cancer cell lines. Various approaches have been designed and applied to select cell lines for screening: random, greedy, uncertainty, diversity, a combination of greedy and uncertainty, sampling-based hybrid, and iteration-based hybrid approaches. All of these approaches are evaluated and compared using two criteria: (1) the number of identified hits, i.e., selected experiments validated to be responsive, and (2) the performance of the response prediction model trained on the data of the selected experiments. The analysis was conducted for 57 drugs, and the results show a significant improvement in identifying hits using active learning approaches compared with the random and greedy sampling methods. Active learning approaches also show an improvement in response prediction performance for some of the drugs and analysis runs compared with the greedy sampling method.
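The selection loop described in the abstract can be illustrated with a minimal Python sketch for a single drug. It covers only three of the listed strategies (random, greedy, uncertainty) and counts hits among the selected cell lines. Everything specific here is an assumption made for illustration and is not taken from the paper: the bootstrap k-nearest-neighbour ensemble stands in for whatever prediction model the authors used, a lower response value is assumed to mean a more sensitive cell line, a hit is assumed to be a response below `hit_threshold`, and all function and parameter names are hypothetical.

```python
import random
import statistics


def knn_predict(train, query, k=3):
    """Mean response of the k labeled cell lines closest to `query`."""
    nearest = sorted(
        train, key=lambda item: sum((a - b) ** 2 for a, b in zip(item[0], query))
    )[:k]
    return statistics.fmean([resp for _, resp in nearest])


def ensemble_predict(train, queries, rng, n_models=10):
    """Bootstrap ensemble (a stand-in model): (mean, std) prediction per query."""
    boots = [rng.choices(train, k=len(train)) for _ in range(n_models)]
    stats = []
    for q in queries:
        preds = [knn_predict(b, q) for b in boots]
        stats.append((statistics.fmean(preds), statistics.pstdev(preds)))
    return stats


def select(strategy, stats, batch, rng):
    """Return pool positions of the next cell lines to screen."""
    positions = list(range(len(stats)))
    if strategy == "random":
        return rng.sample(positions, batch)
    if strategy == "greedy":  # assumed: lowest predicted response = most sensitive
        return sorted(positions, key=lambda p: stats[p][0])[:batch]
    if strategy == "uncertainty":  # largest disagreement across the ensemble
        return sorted(positions, key=lambda p: -stats[p][1])[:batch]
    raise ValueError(f"unknown strategy: {strategy}")


def run_active_learning(features, responses, strategy, n_init=10, batch=5,
                        n_rounds=6, hit_threshold=0.5, seed=0):
    """Iteratively pick cell lines to screen; return the picks and the hit count."""
    rng = random.Random(seed)
    order = list(range(len(responses)))
    rng.shuffle(order)
    labeled, pool = order[:n_init], order[n_init:]  # random initial screen
    selected = []
    for _ in range(n_rounds):
        train = [(features[i], responses[i]) for i in labeled + selected]
        stats = ensemble_predict(train, [features[i] for i in pool], rng)
        chosen = [pool[p] for p in select(strategy, stats, batch, rng)]
        selected.extend(chosen)  # "screening" reveals the true responses
        chosen_set = set(chosen)
        pool = [i for i in pool if i not in chosen_set]
    hits = sum(responses[i] < hit_threshold for i in selected)
    return selected, hits


if __name__ == "__main__":
    # Synthetic data only; no real screening measurements are used here.
    rng = random.Random(42)
    feats = [[rng.gauss(0, 1) for _ in range(4)] for _ in range(60)]
    resp = [0.5 + 0.3 * f[0] - 0.2 * f[1] for f in feats]
    for name in ("random", "greedy", "uncertainty"):
        _, hits = run_active_learning(feats, resp, name)
        print(name, "hits:", hits)
```

The diversity and hybrid strategies named in the abstract are not sketched because the abstract does not define them. They would plug into `select` as additional branches.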
Pages: 18