Prediction of hERG potassium channel blockage using ensemble learning methods and molecular fingerprints

Cited by: 25
Authors
Liu, Miao [1]
Zhang, Li [1, 2, 3]
Li, Shimeng [1]
Yang, Tianzhou [1]
Liu, Lili [1]
Zhao, Jian [1]
Liu, Hongsheng [1, 2, 3]
Affiliations
[1] Liaoning Univ, Sch Life Sci, 66 Chongshan Zhonglu, Shenyang 110036, Liaoning, Peoples R China
[2] Liaoning Univ, Res Ctr Comp Simulating & Informat Proc Biomacrom, Shenyang 110036, Peoples R China
[3] Liaoning Univ, Engn Lab Mol Simulat & Designing Drug Mol Liaonin, Shenyang 110036, Peoples R China
Keywords
hERG; Molecular fingerprint; Molecular descriptor; Machine learning; Ensemble model; IN-SILICO PREDICTION; CLASSIFICATION MODELS; K+ CHANNEL; ADMET EVALUATION; BLOCKERS; STRATEGIES; MUTATIONS;
DOI
10.1016/j.toxlet.2020.07.003
Chinese Library Classification number
R99 [Toxicology];
Discipline classification code
100405;
Abstract
The human ether-a-go-go-related gene (hERG) encodes a tetrameric potassium channel called Kv11.1. Certain drugs can block this channel, which can lead to long QT syndrome and cardiotoxicity, a significant problem in drug development. Predicting compound cardiotoxicity with computer models during the early stages of drug design can help to address this problem. In this study, we used a dataset of 1865 compounds with known hERG inhibitory activities as a training set. Thirty cardiotoxicity classification models were built using three machine learning algorithms based on molecular fingerprints and molecular descriptors. Using these models as base classifiers, we developed a new cardiotoxicity classification model with better predictive performance by means of an ensemble learning method. The best base classifier, generated using the XGBoost method with molecular descriptors, reached an accuracy of 84.8% and an area under the receiver operating characteristic curve (AUC) of 0.876 in five-fold cross-validation. All of the ensemble models that we developed had higher predictive performance than the base classifiers in five-fold cross-validation. The best predictive performance was achieved by the Ensemble-Top7 model, with an accuracy of 84.9% and an AUC of 0.887. We also tested the ensemble model on external validation data and obtained an accuracy of 85.0% and an AUC of 0.786. Furthermore, we identified several hERG-related substructures, which provide valuable information for designing drug candidates.
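The idea of combining the best base classifiers can be illustrated with a minimal sketch. It assumes that a "Top-k" ensemble ranks base models by cross-validated AUC and averages their predicted blocker probabilities (soft voting); the abstract does not state the combination rule, so this is an assumption rather than the authors' method. The function name `top_k_soft_vote`, the model names, and every number except the 0.876 AUC quoted in the abstract are hypothetical.

```python
from statistics import mean


def top_k_soft_vote(base_probs, cv_auc, k):
    """Average the blocker probabilities of the k base models with the best CV AUC.

    base_probs: model name -> list of predicted probabilities, one per compound
    cv_auc:     model name -> cross-validated AUC used for ranking
    """
    # Rank base classifiers by cross-validated AUC, best first, and keep k of them.
    selected = sorted(cv_auc, key=cv_auc.get, reverse=True)[:k]
    n_compounds = len(base_probs[selected[0]])
    # Soft voting: mean probability per compound across the selected models.
    probs = [mean(base_probs[m][i] for m in selected) for i in range(n_compounds)]
    return selected, probs


# Hypothetical model names and values; only 0.876 comes from the abstract.
cv_auc = {"xgb_descriptors": 0.876, "rf_fingerprint": 0.85, "svm_fingerprint": 0.80}
base_probs = {
    "xgb_descriptors": [0.9, 0.2, 0.6],
    "rf_fingerprint": [0.7, 0.4, 0.3],
    "svm_fingerprint": [0.1, 0.9, 0.5],
}

selected, probs = top_k_soft_vote(base_probs, cv_auc, k=2)
# Assumed decision threshold of 0.5: 1 = predicted hERG blocker, 0 = non-blocker.
labels = [int(p >= 0.5) for p in probs]
print(selected, labels)
```

Averaging probabilities rather than hard labels lets a confident base model outweigh an uncertain one, which is one common reason ensembles of this kind can outperform their individual members.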
Pages: 88-96
Page count: 9