SAEROF: an ensemble approach for large-scale drug-disease association prediction by incorporating rotation forest and sparse autoencoder deep neural network

Cited by: 35
Authors
Jiang, Han-Jing [1, 2, 3]
Huang, Yu-An [4]
You, Zhu-Hong [1, 2, 3]
Affiliations
[1] Chinese Acad Sci, Xinjiang Tech Inst Phys & Chem, Urumqi 830011, Peoples R China
[2] Univ Chinese Acad Sci, Beijing 100049, Peoples R China
[3] Xinjiang Lab Minor Speech & Language Informat Pro, Urumqi, Peoples R China
[4] Hong Kong Polytech Univ, Dept Comp, Hung Hom, Hong Kong, Peoples R China
Funding
National Science Foundation (USA)
Keywords
DOI
10.1038/s41598-020-61616-9
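Chinese Library Classification (CLC) number
O [Mathematical Sciences and Chemistry]; P [Astronomy and Earth Sciences]; Q [Biological Sciences]; N [General Natural Sciences]
Discipline classification code
07; 0710; 09
Abstract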
Drug-disease associations are an important source of information at every stage of drug repositioning. Although the number of drug-disease associations identified by high-throughput technologies is increasing, experimental methods remain time-consuming and expensive. To supplement them, many computational methods have been developed for accurate in silico prediction of new drug-disease associations. In this work, we present a novel computational model, SAEROF, which combines a sparse auto-encoder with rotation forest to predict drug-disease associations. Gaussian interaction profile kernel similarity, drug structure similarity and disease semantic similarity were extracted to characterize the associations between drugs and diseases. On this basis, a rotation forest classifier based on the sparse auto-encoder is proposed to predict associations between drugs and diseases. To evaluate the proposed model, we performed 10-fold cross-validation on two gold-standard datasets, Fdataset and Cdataset. The model achieved AUCs (areas under the ROC curve) of 0.9092 on Fdataset and 0.9323 on Cdataset. For further evaluation, we compared SAEROF with the state-of-the-art support vector machine (SVM) classifier and with several existing computational models. Three human diseases (obesity, stomach neoplasms and lung neoplasms) were explored in case studies; more than half of the top 20 predicted drugs were confirmed by the Comparative Toxicogenomics Database (CTD). The model is a feasible and effective method for predicting drug-disease associations, and its performance is significantly improved compared with existing methods.
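The abstract names Gaussian interaction profile (GIP) kernel similarity as one of the extracted features but does not give its formula. The following is a minimal sketch of the commonly used GIP definition, in which the similarity of two drugs is exp(-gamma * ||p_i - p_j||^2) for their binary association profiles p_i and p_j, with gamma normalised by the mean squared profile norm. The function name `gip_kernel`, the parameter `gamma_prime` (set to 1.0 here) and the toy matrix are assumptions for illustration, not details taken from the paper.

```python
import math


def gip_kernel(profiles, gamma_prime=1.0):
    """Gaussian interaction profile kernel over the rows of a binary matrix.

    profiles: list of equal-length rows, one per drug (or per disease),
              where 1 marks a known association and 0 an unknown one.
    gamma_prime: bandwidth factor; 1.0 is an assumed default.
    """
    n = len(profiles)
    # Squared Euclidean norm of each interaction profile.
    sq_norms = [sum(v * v for v in row) for row in profiles]
    mean_sq = sum(sq_norms) / n
    if mean_sq == 0:
        raise ValueError("all interaction profiles are empty")
    # Normalise the bandwidth by the average squared profile norm.
    gamma = gamma_prime / mean_sq

    kernel = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            dist_sq = sum((a - b) ** 2 for a, b in zip(profiles[i], profiles[j]))
            kernel[i][j] = kernel[j][i] = math.exp(-gamma * dist_sq)
    return kernel


# Toy drug-by-disease association matrix (3 drugs, 3 diseases), hypothetical.
associations = [
    [1, 0, 1],
    [1, 0, 0],
    [0, 1, 0],
]
drug_similarity = gip_kernel(associations)
```

Applying the same function to the transposed matrix would give the disease-side kernel. How the paper combines this kernel with the structure and semantic similarities is not stated in the abstract and is not modelled here.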
Pages: 11