ADMET Predictability at Boehringer Ingelheim: State-of-the-Art, and Do Bigger Datasets or Algorithms Make a Difference?

Cited by: 32
Authors
Aleksic, Stevan [1]
Seeliger, Daniel [1]
Brown, J. B. [1]
Affiliations
[1] Boehringer Ingelheim Pharma GmbH & Co KG, Med Chem, D-88397 Biberach, Germany
Keywords
ADMET modelling; machine learning; algorithm comparison; chemical representation; congeneric series; DRUG DISCOVERY; WEB SERVICES; PREDICTION; ACCESS; TOX;
DOI
10.1002/minf.202100113
Chinese Library Classification (CLC) number
R914 [Medicinal Chemistry];
Discipline classification code
100701;
Abstract
Computational methods assisting drug discovery and development are routine in the pharmaceutical industry. Digital recording of ADMET assays has provided a rich source of data for development of predictive models. Despite the accumulation of data and the public availability of advanced modeling algorithms, the utility of prediction in ADMET research is not clear. Here, we present a critical evaluation of the relationships between data volume, modeling algorithm, chemical representation and grouping, and temporal aspect (time sequence of assays) using an in-house ADMET database. We find no large difference in prediction algorithms nor any systemic and substantial gain from increasingly large datasets. Temporal-based data enlargement led to performance improvement in only a limited number of assays, and with fractional improvement at best. Assays that are well-, intermediately-, or poorly-suited for ADMET predictions and reasons for such behavior are systematically identified, generating realistic expectations for areas in which computational models can be used to guide decision making in molecular design and development.
Pages: 16