NegStacking: Drug-Target Interaction Prediction Based on Ensemble Learning and Logistic Regression

Cited by: 12
Authors
Yang, Jie [1]
He, Song [2]
Zhang, Zhongnan [1]
Bo, Xiaochen [2]
Affiliations
[1] Xiamen Univ, Sch Informat, Xiamen 361005, Fujian, Peoples R China
[2] Beijing Inst Radiat Med, Beijing 100850, Peoples R China
Keywords
Drugs; Training data; Proteins; Predictive models; Databases; Diffusion tensor imaging; Machine learning; Drug-target interactions; class imbalance; ensemble learning; stacked generalization
DOI
10.1109/TCBB.2020.2968025
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Discipline classification code
071010; 081704
Abstract
Identifying drug-target interactions (DTIs) is an important problem in drug research, and many machine learning methods proposed to predict potential DTIs treat it as a binary classification problem. However, the number of known interacting drug-target pairs (positive samples) is far smaller than the number of non-interacting pairs (negative samples). Most methods do not make sufficient use of this large pool of negative samples, which limits their prediction performance. To address this problem, we propose a stacking framework named NegStacking. First, it uses sampling to obtain multiple completely different negative sample sets. Then, each weak learner is trained with a different negative sample set and the same positive sample set, and logistic regression (LR) is used as a meta-learner to adaptively combine these weak learners. Moreover, during training, feature subspacing and hyperparameter perturbation are applied to increase ensemble diversity. Finally, the trained model can be used to predict new samples. We compared NegStacking with other methods, and the experimental results show that our model outperforms them. NegStacking can improve DTI prediction performance, and it has broad application prospects for improving the drug discovery process. The source code and datasets are available at https://github.com/Open-ss/NegStacking.
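The training procedure described in the abstract can be illustrated with a minimal, standard-library-only sketch. It is not the authors' implementation (that is in the linked repository). All function names here are hypothetical. A small hand-written logistic regression stands in for the weak learners, whose type the abstract does not specify. The learning-rate choices, the held-out rows used to fit the meta-learner, and the synthetic data are likewise assumptions made only for illustration.

```python
import math
import random


def disjoint_negative_sets(negatives, n_sets, set_size, rng):
    """Shuffle once and slice, so no negative sample appears in two sets."""
    if n_sets * set_size > len(negatives):
        raise ValueError("not enough negatives for disjoint sets")
    pool = list(negatives)
    rng.shuffle(pool)
    return [pool[i * set_size:(i + 1) * set_size] for i in range(n_sets)]


def sigmoid(z):
    # Branch on the sign to avoid overflow in math.exp.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logreg(X, y, lr=0.1, epochs=200):
    """Plain SGD logistic regression; returns (weights, bias)."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            w = [wj - lr * err * xj for wj, xj in zip(w, xi)]
            b -= lr * err
    return w, b


def predict_proba(model, x):
    w, b = model
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


def train_stack(positives, negatives, meta_X, meta_y,
                n_learners, n_sub_features, seed=0):
    rng = random.Random(seed)
    n_features = len(positives[0])
    neg_sets = disjoint_negative_sets(negatives, n_learners, len(positives), rng)
    base = []
    for neg in neg_sets:
        # Feature subspacing: each base learner sees a random column subset.
        cols = sorted(rng.sample(range(n_features), n_sub_features))
        # Hyperparameter perturbation: vary the learning rate per learner.
        lr = rng.choice([0.05, 0.1, 0.2])
        X = [[row[c] for c in cols] for row in positives + neg]
        y = [1] * len(positives) + [0] * len(neg)
        base.append((cols, fit_logreg(X, y, lr=lr)))
    # Meta-learner: logistic regression over base-learner probabilities,
    # fitted on held-out rows (an assumption of this sketch).
    Z = [[predict_proba(m, [x[c] for c in cols]) for cols, m in base]
         for x in meta_X]
    return base, fit_logreg(Z, meta_y)


def predict_stacked(model, x):
    base, meta = model
    feats = [predict_proba(m, [x[c] for c in cols]) for cols, m in base]
    return predict_proba(meta, feats)


# Synthetic, well-separated data purely for demonstration.
data_rng = random.Random(42)


def make_rows(center, n):
    return [tuple(data_rng.gauss(center, 0.5) for _ in range(4)) for _ in range(n)]


positives, negatives = make_rows(2.0, 30), make_rows(-2.0, 300)
meta_X = make_rows(2.0, 10) + make_rows(-2.0, 10)
meta_y = [1] * 10 + [0] * 10
model = train_stack(positives, negatives, meta_X, meta_y,
                    n_learners=3, n_sub_features=2)
p_pos = predict_stacked(model, (2.0, 2.0, 2.0, 2.0))
p_neg = predict_stacked(model, (-2.0, -2.0, -2.0, -2.0))
```

Shuffling once and slicing is one simple way to guarantee that the negative sets share no samples. Each base learner still sees every positive sample, so the ensemble as a whole uses many more negatives than any single balanced training set would.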
Pages: 2624-2634
Page count: 11