Predicting Drug-Induced Liver Injury Using Machine Learning on a Diverse Set of Predictors

Cited by: 13
Authors
Adeluwa, Temidayo [1]
McGregor, Brett A. [1]
Guo, Kai [1, 2]
Hur, Junguk [1]
Affiliations
[1] Univ North Dakota, Dept Biomed Sci, Grand Forks, ND 58202 USA
[2] Univ Michigan, Dept Neurol, Ann Arbor, MI USA
Keywords
DILI; Connectivity Map; Tox21; FAERS; machine learning; Mold2; MATHEMATICAL STRUCTURE
DOI
10.3389/fphar.2021.648805
Chinese Library Classification
R9 [Pharmacy]
Subject classification code
1007
Abstract
Safety and toxicity concerns arising from drug side effects are a major challenge in drug development. One such side effect, drug-induced liver injury (DILI), is considered a primary factor in regulatory clearance. The goal of the Critical Assessment of Massive Data Analysis (CAMDA) 2020 CMap Drug Safety Challenge was to develop prediction models based on gene perturbation of six preselected cell lines (CMap L1000), extended structural information (MOLD2), toxicity data (TOX21), and FDA reporting of adverse events (FAERS). Four types of DILI classes were targeted, including two clinically relevant scores and two control classifications, designed by the CAMDA organizers. The L1000 gene expression data had variable drug coverage across cell lines, with only 247 out of 617 drugs in the study measured in all six cell types. We addressed this coverage issue by using Kru-Bor ranked merging to generate a single drug expression signature across all six cell lines. These merged signatures were then narrowed down to the top and bottom 100, 250, 500, or 1,000 genes most perturbed by drug treatment. These signatures were subjected to feature selection using Fisher's exact test to identify genes predictive of DILI status. Models based solely on expression signatures had varying results for clinical DILI subtypes, with accuracy ranging from 0.49 to 0.67 and Matthews Correlation Coefficient (MCC) values ranging from -0.03 to 0.1. Models built using FAERS, MOLD2, and TOX21 had similar results in predicting clinical DILI scores, with accuracy ranging from 0.56 to 0.67 and MCC scores ranging from 0.12 to 0.36. To incorporate these various data types with expression-based models, we utilized soft, hard, and weighted ensemble voting methods using the top three performing models for each DILI classification. These voting models achieved balanced accuracies of up to 0.54 and 0.60 for the two clinically relevant DILI subtypes. Overall, our experiments suggest that traditional machine learning approaches may not be optimal classification methods for the current data.
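The abstract names hard, soft, and weighted ensemble voting and reports MCC and balanced accuracy, but gives no implementation details. The following is a minimal sketch of those ideas under stated assumptions: the function names, the 0.5 decision threshold, the weights, and the toy probabilities are all hypothetical and chosen only for illustration. They are not the authors' code and not data from the study.

```python
from math import sqrt


def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels coded as 0/1."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn


def mcc(y_true, y_pred):
    """Matthews correlation coefficient; 0.0 when the denominator is zero."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


def balanced_accuracy(y_true, y_pred):
    """Mean of sensitivity and specificity (both classes must be present)."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    return (tp / (tp + fn) + tn / (tn + fp)) / 2


def hard_vote(probas, threshold=0.5):
    """Majority vote over each model's thresholded prediction."""
    n_models = len(probas)
    labels = []
    for i in range(len(probas[0])):
        votes = sum(1 for model in probas if model[i] >= threshold)
        labels.append(1 if 2 * votes > n_models else 0)
    return labels


def soft_vote(probas, weights=None, threshold=0.5):
    """Average the predicted probabilities, then threshold.

    Passing `weights` gives weighted voting; None means equal weights.
    """
    if weights is None:
        weights = [1.0] * len(probas)
    total = sum(weights)
    labels = []
    for i in range(len(probas[0])):
        avg = sum(w * model[i] for w, model in zip(weights, probas)) / total
        labels.append(1 if avg >= threshold else 0)
    return labels


# Toy data (made up): predicted DILI-positive probabilities from three models.
y_true = [1, 1, 0, 0, 1, 0]
model_a = [0.9, 0.4, 0.2, 0.6, 0.7, 0.1]
model_b = [0.6, 0.3, 0.4, 0.2, 0.4, 0.3]
model_c = [0.2, 0.9, 0.6, 0.1, 0.9, 0.4]
probas = [model_a, model_b, model_c]

hard = hard_vote(probas)
soft = soft_vote(probas)
weighted = soft_vote(probas, weights=[3.0, 1.0, 1.0])  # hypothetical weights

for name, pred in [("hard", hard), ("soft", soft), ("weighted", weighted)]:
    print(name, pred, round(mcc(y_true, pred), 3),
          round(balanced_accuracy(y_true, pred), 3))
```

On this toy input the hard and soft votes disagree on the second sample, where one confident model outweighs two lukewarm ones in the probability average but loses the majority count. In practice, weights would typically be derived from validation performance; how the study set them is not stated in the abstract.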
Pages: 14