Predicting Drug-Induced Liver Injury Using Ensemble Learning Methods and Molecular Fingerprints

Cited by: 66
Authors
Ai, Haixin [1, 2, 3]
Chen, Wen [4]
Zhang, Li [1, 2, 3]
Huang, Liangchao [4]
Yin, Zimo [4]
Hu, Huan [1]
Zhao, Qi [5]
Zhao, Jian [1]
Liu, Hongsheng [1, 2, 3]
Affiliations
[1] Liaoning Univ, Sch Life Sci, Shenyang 110036, Liaoning, Peoples R China
[2] Res Ctr Comp Simulating & Informat Proc Biomacrom, Shenyang 110036, Liaoning, Peoples R China
[3] Engn Lab Mol Simulat & Designing Drug Mol Liaonin, Shenyang 110036, Liaoning, Peoples R China
[4] Liaoning Univ, Sch Informat, Shenyang 110036, Liaoning, Peoples R China
[5] Liaoning Univ, Sch Math, Shenyang 110036, Liaoning, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
DILI; hepatotoxicity; molecular fingerprints; machine learning; ensemble; COMPUTATIONAL TOXICOLOGY; MODELS;
DOI
10.1093/toxsci/kfy121
Chinese Library Classification (CLC) number
R99 [Toxicology];
Discipline classification code
100405;
Abstract
Drug-induced liver injury (DILI) is a major safety concern in the drug-development process, and various methods have been proposed to predict the hepatotoxicity of compounds during the early stages of drug trials. In this study, we developed an ensemble model using 3 machine learning algorithms and 12 molecular fingerprints from a dataset containing 1241 diverse compounds. The ensemble model achieved an average accuracy of 71.1 +/- 2.6%, sensitivity (SE) of 79.9 +/- 3.6%, specificity (SP) of 60.3 +/- 4.8%, and area under the receiver-operating characteristic curve (AUC) of 0.764 +/- 0.026 in 5-fold cross-validation and an accuracy of 84.3%, SE of 86.9%, SP of 75.4%, and AUC of 0.904 in an external validation dataset of 286 compounds collected from the Liver Toxicity Knowledge Base. Compared with previous methods, the ensemble model achieved relatively high accuracy and SE. We also identified several substructures related to DILI. In addition, we provide a web server offering access to our models (http://ccsipb.lnu.edu.cn/toxicity/HepatoPred-EL/).
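The evaluation metrics quoted in the abstract (accuracy, SE, SP, AUC) can be illustrated with a minimal sketch. It assumes soft voting, that is, averaging the predicted probabilities of the base models, and a 0.5 decision threshold; the abstract does not state how the ensemble combines its base classifiers, so this is an assumption rather than the authors' method. All function names and toy values are hypothetical.

```python
from statistics import mean


def soft_vote(prob_lists):
    """Average per-compound probabilities across base models.

    Assumed combination rule; the abstract does not specify one.
    """
    return [mean(column) for column in zip(*prob_lists)]


def classification_metrics(y_true, y_prob, threshold=0.5):
    """Accuracy, sensitivity (SE) and specificity (SP) at a fixed threshold."""
    y_pred = [1 if p >= threshold else 0 for p in y_prob]
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(y_true),
        "sensitivity": tp / (tp + fn),  # fraction of DILI-positives recovered
        "specificity": tn / (tn + fp),  # fraction of DILI-negatives recovered
    }


def roc_auc(y_true, y_prob):
    """AUC as the probability that a random positive outscores a random negative.

    Ties count as half a win.
    """
    pos = [p for t, p in zip(y_true, y_prob) if t == 1]
    neg = [p for t, p in zip(y_true, y_prob) if t == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): 1 = hepatotoxic, 0 = non-hepatotoxic.
y_true = [1, 1, 1, 0, 0, 0]
model_a = [0.9, 0.6, 0.4, 0.3, 0.7, 0.1]
model_b = [0.8, 0.7, 0.3, 0.2, 0.5, 0.2]
model_c = [0.7, 0.8, 0.5, 0.4, 0.6, 0.3]

ensemble_prob = soft_vote([model_a, model_b, model_c])
metrics = classification_metrics(y_true, ensemble_prob)
auc = roc_auc(y_true, ensemble_prob)
```

In a 5-fold cross-validation setting such as the one reported, these metrics would be computed once per held-out fold and then summarized as mean and standard deviation.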
Pages: 100-107
Page count: 8