NegStacking: Drug-Target Interaction Prediction Based on Ensemble Learning and Logistic Regression

Cited by: 15
Authors
Yang, Jie [1]
He, Song [2]
Zhang, Zhongnan [1]
Bo, Xiaochen [2]
Affiliations
[1] Xiamen Univ, Sch Informat, Xiamen 361005, Fujian, Peoples R China
[2] Beijing Inst Radiat Med, Beijing 100850, Peoples R China
Keywords
Drugs; Training data; Proteins; Predictive models; Databases; Diffusion tensor imaging; Machine learning; Drug-target interactions; Class imbalance; Ensemble learning; Stacked generalization
DOI
10.1109/TCBB.2020.2968025
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Discipline classification code
071010; 081704
Abstract
Identifying drug-target interactions (DTIs) is an important problem in drug research, and many machine learning methods for predicting potential DTIs treat it as a binary classification problem. However, the number of known interacting drug-target pairs (positive samples) is far smaller than the number of non-interacting pairs (negative samples). Most methods do not make sufficient use of this large pool of negative samples, which limits their prediction performance. To address this problem, we propose a stacking framework named NegStacking. First, it uses sampling to obtain multiple completely different negative sample sets. Then, each weak learner is trained on a different negative sample set together with the same positive sample set, and logistic regression (LR) is used as a meta-learner to adaptively combine these weak learners. Moreover, during training, feature subspacing and hyperparameter perturbation are applied to increase ensemble diversity. Finally, the trained model can be used to predict new samples. We compared NegStacking with other methods, and the experimental results show that our model is superior. NegStacking can improve DTI prediction performance, and it has broad application prospects for improving the drug discovery process. The source code and datasets are available at https://github.com/Open-ss/NegStacking.
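The pipeline described in the abstract can be illustrated with a minimal, standard-library-only Python sketch. It is not the authors' implementation (that is in the repository linked above). Several choices here are assumptions made purely for illustration: the weak learners are small gradient-descent logistic regressions, the only perturbed hyperparameter is the learning rate, the meta-learner is fitted on the positives plus one extra reserved negative set, and all function names (`split_negatives`, `train_negstacking_sketch`, and so on) are hypothetical.

```python
import math
import random

def split_negatives(negatives, n_sets, set_size, rng):
    """Draw n_sets pairwise-disjoint negative sets: shuffle once, then slice."""
    if n_sets * set_size > len(negatives):
        raise ValueError("not enough negatives for disjoint sets")
    shuffled = negatives[:]
    rng.shuffle(shuffled)
    return [shuffled[i * set_size:(i + 1) * set_size] for i in range(n_sets)]

def sigmoid(z):
    z = max(-30.0, min(30.0, z))  # clamp to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-z))

def fit_logreg(X, y, lr=0.1, epochs=200):
    """Batch gradient-descent logistic regression; returns (weights, bias)."""
    w, b, n = [0.0] * len(X[0]), 0.0, len(X)
    for _ in range(epochs):
        gw, gb = [0.0] * len(w), 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j, xj in enumerate(xi):
                gw[j] += err * xj
            gb += err
        w = [wj - lr * g / n for wj, g in zip(w, gw)]
        b -= lr * gb / n
    return w, b

def predict_proba(model, x):
    w, b = model
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)

def weak_scores(learners, x):
    """Score x with every weak learner on that learner's own feature subset."""
    return [predict_proba(m, [x[j] for j in feats]) for feats, m in learners]

def train_negstacking_sketch(positives, negatives, n_learners=5,
                             subspace=0.7, seed=0):
    rng = random.Random(seed)
    n_feat = len(positives[0])
    k = max(1, int(subspace * n_feat))
    # Assumption: one extra disjoint negative set is reserved for the meta-learner.
    neg_sets = split_negatives(negatives, n_learners + 1, len(positives), rng)
    learners = []
    for neg in neg_sets[:n_learners]:
        feats = sorted(rng.sample(range(n_feat), k))  # feature subspacing
        lr = rng.uniform(0.05, 0.5)                   # hyperparameter perturbation
        X = [[x[j] for j in feats] for x in positives + neg]
        y = [1] * len(positives) + [0] * len(neg)
        learners.append((feats, fit_logreg(X, y, lr=lr)))
    # Logistic-regression meta-learner combines the weak learners' scores.
    meta_X = [weak_scores(learners, x) for x in positives + neg_sets[-1]]
    meta_y = [1] * len(positives) + [0] * len(neg_sets[-1])
    return learners, fit_logreg(meta_X, meta_y, lr=0.5)

def predict(model, x):
    learners, meta = model
    return predict_proba(meta, weak_scores(learners, x))

# Synthetic, well-separated toy data (not DTI data): 20 positives, 200 negatives.
demo_rng = random.Random(1)
positives = [[demo_rng.gauss(2, 1) for _ in range(4)] for _ in range(20)]
negatives = [[demo_rng.gauss(-2, 1) for _ in range(4)] for _ in range(200)]
model = train_negstacking_sketch(positives, negatives)
print(predict(model, [2, 2, 2, 2]), predict(model, [-2, -2, -2, -2]))
```

Shuffling once and slicing is one simple way to guarantee that the negative sets share no samples, which matches the "completely different" sets the abstract mentions. A real stacking setup would more likely fit the meta-learner on out-of-fold predictions rather than on positives the weak learners have already seen.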
Pages: 2624-2634
Number of pages: 11