On the robustness of generalization of drug-drug interaction models

Cited by: 3
Authors
Kpanou, Rogia [1 ,2 ]
Osseni, Mazid Abiodoun [1 ]
Tossou, Prudencio [1 ,2 ]
Laviolette, Francois [1 ]
Corbeil, Jacques [3 ]
Affiliations
[1] Univ Laval, Comp Sci & Software Engn, 1065 Av Med, Quebec City, PQ, Canada
[2] InVivo AI, Mila 180 Corp Lab L,6650,01 Rue St Urba, Montreal, PQ H2S 3G9, Canada
[3] Univ Laval, Dept Mol Med, 1065 Av Med, Quebec City, PQ, Canada
Keywords
Drug-drug interaction; Side effects; Deep learning; Robustness; Generalizability; LANGUAGE; NETWORK; SYSTEM;
DOI
10.1186/s12859-021-04398-9
Chinese Library Classification number
Q5 [Biochemistry];
Discipline classification code
071010 ; 081704 ;
Abstract
Background: Deep learning methods have proven effective in many fields. One such application is predicting adverse drug-drug interactions (DDIs). The resulting models can predict, with reasonable accuracy, the phenotypes arising from drug interactions using the drugs' molecular structures. Nevertheless, performance on this task must improve for the models to be truly useful. Given the complexity of the predictive task, an extensive benchmark of structure-based models for DDI prediction was performed to evaluate their drawbacks and advantages.
Results: We rigorously tested various structure-based models that predict drug interactions, using different splitting strategies to simulate different real-world scenarios. In addition to the effects of different training and testing setups on the robustness and generalizability of the models, we explored the contribution of traditional approaches such as multitask learning and data augmentation.
Conclusion: Structure-based models tend to generalize poorly to unseen drugs, despite their ability to accurately identify new DDIs among drugs seen during training. Indeed, they efficiently propagate information between known drugs and could be valuable for discovering new DDIs in a database. However, these models will most probably fail when exposed to unknown drugs. While multitask learning does not solve the problem in our case, data augmentation does at least mitigate it. Therefore, researchers must be cautious of the bias of the random evaluation scheme, especially if their goal is to discover new DDIs.
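The contrast the abstract draws between a random evaluation scheme and evaluation on unseen drugs can be illustrated with a minimal sketch. This is not the authors' code: the function names, the toy drug identifiers, and the rule "a test pair contains at least one held-out drug" are all assumptions made for illustration, and the record does not specify how the paper's splitting strategies are implemented.

```python
import random
from itertools import combinations


def random_pair_split(pairs, test_frac=0.2, seed=0):
    """Random scheme: shuffle the drug pairs and cut off a test fraction.
    The same drug can appear in both the training and the test pairs."""
    rng = random.Random(seed)
    shuffled = list(pairs)
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_frac)
    return shuffled[n_test:], shuffled[:n_test]


def unseen_drug_split(pairs, test_frac=0.2, seed=0):
    """Hypothetical unseen-drug scheme: hold out a fraction of the drugs.
    Training pairs contain no held-out drug; test pairs contain at least one."""
    rng = random.Random(seed)
    drugs = sorted({d for pair in pairs for d in pair})
    rng.shuffle(drugs)
    n_held = max(1, int(len(drugs) * test_frac))
    held_out = set(drugs[:n_held])
    train = [p for p in pairs if not (set(p) & held_out)]
    test = [p for p in pairs if set(p) & held_out]
    return train, test, held_out


def drugs_in(pairs):
    """Set of all drugs that occur in a list of pairs."""
    return {d for pair in pairs for d in pair}


# Toy data (assumed): 10 placeholder drugs and every unordered pair of them.
toy_drugs = [f"drug_{i}" for i in range(10)]
toy_pairs = list(combinations(toy_drugs, 2))

rand_train, rand_test = random_pair_split(toy_pairs)
cold_train, cold_test, held_out = unseen_drug_split(toy_pairs)

# Under the random scheme, test drugs typically also occur in training pairs.
print("random: test drugs also seen in training:",
      len(drugs_in(rand_test) & drugs_in(rand_train)))
# Under the unseen-drug scheme, held-out drugs never occur in training pairs.
print("unseen-drug: held-out drugs seen in training:",
      len(held_out & drugs_in(cold_train)))
```

Under the random scheme a model can score well by propagating information between drugs it has already seen, which is the bias the conclusion warns about. The unseen-drug scheme removes that shortcut by construction.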
Pages: 21