Machine learning models for classification tasks related to drug safety

Cited by: 36
Authors
Racz, Anita [1]
Bajusz, David [2]
Miranda-Quintana, Ramon Alain [3, 4]
Heberger, Karoly [1]
Affiliations
[1] Res Ctr Nat Sci, Plasma Chem Res Grp, Magyar Tudosok Krt 2, H-1117 Budapest, Hungary
[2] Res Ctr Nat Sci, Med Chem Res Grp, Magyar Tudosok Krt 2, H-1117 Budapest, Hungary
[3] Univ Florida, Dept Chem, Gainesville, FL 32603 USA
[4] Univ Florida, Quantum Theory Project, Gainesville, FL 32603 USA
Keywords
ADMET; Toxicity; Big data; QSAR; In silico modeling; Machine learning; BLOOD-BRAIN-BARRIER; IN-SILICO PREDICTION; POTASSIUM CHANNEL BLOCKAGE; P-GLYCOPROTEIN; ADMET EVALUATION; NEURAL-NETWORKS; EYE IRRITATION; CARCINOGENICITY; PERMEABILITY; INHIBITORS
DOI
10.1007/s11030-021-10239-x
Chinese Library Classification number
Q5 [Biochemistry]; Q7 [Molecular biology]
Discipline classification code
071010; 081704
Abstract
In this review, we outline current trends in machine learning-driven classification studies of ADME (absorption, distribution, metabolism and excretion) and toxicity endpoints from the past six years (2015-2021). The study covers only classification models built on large datasets (i.e. more than a thousand compounds). A comprehensive literature search and meta-analysis were carried out for nine targets: hERG-mediated cardiotoxicity, blood-brain barrier penetration, permeability glycoprotein (P-gp) substrate/inhibitor, cytochrome P450 enzyme family, acute oral toxicity, mutagenicity, carcinogenicity, respiratory toxicity and irritation/corrosion. The best classification models were compared to reveal differences between machine learning algorithms and modeling types, endpoint-specific performances, dataset sizes and validation protocols. The data show that tree-based algorithms (still) dominate the field, and that consensus modeling is an increasing trend in drug safety prediction. Although classification models with strong performance already exist for hERG-mediated cardiotoxicity and the isoenzymes of the cytochrome P450 enzyme family, these targets remain central to ADMET-related research efforts.
Pages: 1409-1424
Page count: 16