Nonadditivity in public and inhouse data: implications for drug design

Cited by: 12
Authors
Gogishvili, D. [1, 3]
Nittinger, E. [1]
Margreitter, C. [2]
Tyrchan, C. [1]
Affiliations
[1] AstraZeneca, BioPharmaceut R&D, Med Chem Res & Early Dev, Resp & Immunol R&I, Gothenburg, Sweden
[2] AstraZeneca, R&D, Discovery Sci, Computat Chem, Gothenburg, Sweden
[3] Vrije Univ, Dept Comp Sci, De Boelelaan 1105, NL-1081 HV Amsterdam, Netherlands
Keywords
Nonadditivity analysis; Structure-activity relationship; Matched molecular pair analysis; Experimental uncertainty; Machine learning; Support vector machine; Random forest; PROTEIN-LIGAND-BINDING; CCK1 RECEPTOR ANTAGONISTS; PHASE LIBRARY SYNTHESIS; ACTIVITY CLIFFS; DISCOVERY; PREDICTION; AFFINITY; SAR
DOI
10.1186/s13321-021-00525-z
Chinese Library Classification (CLC) number
O6 [Chemistry]
Discipline classification code
0703
Abstract
Numerous ligand-based drug discovery projects are based on structure-activity relationship (SAR) analysis, such as Free-Wilson (FW) or matched molecular pair (MMP) analysis. Intrinsically, they assume linearity and additivity of substituent contributions. These techniques are challenged by nonadditivity (NA) in protein-ligand binding, where the change of two functional groups in one molecule results in much higher or lower activity than expected from the respective single changes. Identifying nonlinear cases and possible underlying explanations is crucial for a drug design project, since it might influence which lead to follow. By systematically analyzing all AstraZeneca (AZ) inhouse compound data and publicly available ChEMBL25 bioactivity data, we show significant NA events in almost every second assay among the inhouse data sets and in every third assay among the public data sets. Furthermore, 9.4% of all compounds in the AZ database and 5.1% of those from public sources display significant additivity shifts, indicating either important SAR features or fundamental measurement errors. Combining NA data with machine learning showed that nonadditive data is challenging to predict, and even adding nonadditive data to the training set did not increase predictivity. Overall, NA analysis should be applied routinely in many areas of computational chemistry and can further improve rational drug design.
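As an illustration of the additivity assumption described in the abstract, the following minimal Python sketch computes the nonadditivity of a single double-transformation cycle from log-scale activities (for example pIC50). The function names, the example activity values, the per-measurement uncertainty `sigma=0.3` and the cutoff `z=2.0` are assumptions chosen for illustration; they are not taken from the paper and do not reproduce the authors' implementation.

```python
from math import sqrt


def nonadditivity(p_parent, p_single_a, p_single_b, p_double):
    """Nonadditivity of one double-transformation cycle.

    All inputs are log-scale activities (e.g. pIC50). Under perfect
    additivity the double-modified compound gains the sum of the two
    single-change effects relative to the parent.
    """
    expected_double = p_single_a + p_single_b - p_parent
    return p_double - expected_double


def is_significant(na, sigma=0.3, z=2.0):
    """Flag a cycle whose NA exceeds the assumed experimental noise.

    Assumption: the four measurements are independent with the same
    standard deviation `sigma`, so the NA value has standard deviation
    sqrt(4) * sigma. Both `sigma` and `z` are illustrative defaults.
    """
    return abs(na) > z * sqrt(4) * sigma


# Hypothetical cycle: each single change helps modestly, but the
# combination is far more active than the sum of both effects predicts.
na = nonadditivity(p_parent=6.0, p_single_a=7.0, p_single_b=6.5, p_double=9.0)
print(f"NA = {na:.2f} log units, significant: {is_significant(na)}")
```

A cycle whose double-modified compound matches the additive expectation gives an NA of zero and is not flagged; only deviations larger than the assumed noise level are treated as candidates for genuine SAR features or measurement errors.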
Pages: 18