Comparing Machine Learning Algorithms for Predicting Drug-Induced Liver Injury (DILI)

Cited by: 59

Authors:

Minerali, Eni [1]

Foil, Daniel H. [1]

Zorn, Kimberley M. [1]

Lane, Thomas R. [1]

Ekins, Sean [1]

Affiliations:

[1] Collaborat Pharmaceut Inc, Raleigh, NC 27606 USA

Source:

MOLECULAR PHARMACEUTICS | 2020 / Vol. 17 / Issue 07

Funding:

National Institutes of Health (NIH), USA;

Keywords:

Assay Central; bayesian; drug-induced liver injury; machine learning; MegaTox; HEPATOTOXICITY; TOXICOLOGY; ATRIUM(R); WITHDRAWN; AGREEMENT; MODEL; RISK;

DOI:

10.1021/acs.molpharmaceut.0c00326

Chinese Library Classification (CLC) number:

R-3 [Medical research methods]; R3 [Basic medicine];

Discipline classification code:

1001 ;

Abstract:

Drug-induced liver injury (DILI) is one of the most unpredictable adverse reactions to xenobiotics in humans and the leading cause of postmarketing withdrawals of approved drugs. To date, these drugs have been collated by the FDA to form the DILIRank database, which classifies DILI severity and potential. These classifications have been used by various research groups in generating computational predictions for this type of liver injury. Recently, groups from Pfizer and AstraZeneca have collated DILI in vitro data and physicochemical properties for compounds that can be used along with data from the FDA to build machine learning models for DILI. In this study, we have used these data sets, as well as the Biopharmaceutics Drug Disposition Classification System data set, to generate Bayesian machine learning models with our in-house software, Assay Central. The performance of all machine learning models was assessed through both the internal 5-fold cross-validation metrics and prediction accuracy of an external test set of compounds with known hepatotoxicity. The best-performing Bayesian model was based on the DILI-concern category from the DILIRank database with an ROC of 0.814, a sensitivity of 0.741, a specificity of 0.755, and an accuracy of 0.746. A comparison of alternative machine learning algorithms, such as k-nearest neighbors, support vector classification, AdaBoosted decision trees, and deep learning methods, produced similar statistics to those generated with the Bayesian algorithm in Assay Central. This study demonstrates machine learning models grouped in a tool called MegaTox that can be used to predict early-stage clinical compounds, as well as recent FDA-approved drugs, to identify potential DILI.
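
The abstract reports sensitivity, specificity, and accuracy for a binary DILI classifier. As a minimal sketch of how these three metrics relate to a confusion matrix, the Python below computes them from true and predicted labels. The function name `classification_metrics` and the label lists are hypothetical illustrations, not code or data from the study or from Assay Central.

```python
def classification_metrics(y_true, y_pred):
    """Compute sensitivity, specificity, and accuracy for binary labels.

    Labels are assumed to be 1 (DILI positive) or 0 (DILI negative).
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    nan = float("nan")
    return {
        # Fraction of truly hepatotoxic compounds flagged as positive
        "sensitivity": tp / (tp + fn) if (tp + fn) else nan,
        # Fraction of truly non-hepatotoxic compounds flagged as negative
        "specificity": tn / (tn + fp) if (tn + fp) else nan,
        # Fraction of all compounds classified correctly
        "accuracy": (tp + tn) / len(pairs) if pairs else nan,
    }


# Hypothetical labels for ten compounds (illustration only, not study data)
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]

metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

In a 5-fold cross-validation setting such as the one the abstract mentions, a function like this would typically be applied to the held-out predictions of each fold; how the study aggregated its folds is not stated here.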


Pages: 2628-2637

Page count: 10
