DrugMiner: comparative analysis of machine learning algorithms for prediction of potential druggable proteins

Cited by: 73
Authors
Jamali, Ali Akbar [1]
Ferdousi, Reza [2]
Razzaghi, Saeed [3]
Li, Jiuyong [4]
Safdari, Reza [2]
Ebrahimie, Esmaeil [4, 5, 6]
Affiliations
[1] Tabriz Univ Med Sci, RCPN, Tabriz, Iran
[2] Univ Tehran Med Sci, Sch Allied Med Sci, Dept Hlth Informat Management, Tehran, Iran
[3] Univ Zanjan, Informat Technol Ctr, Zanjan, Iran
[4] Univ S Australia, Sch Informat Technol & Math Sci, Div Informat Technol Engn & Environm, Adelaide, SA 5001, Australia
[5] Univ Adelaide, Sch Biol Sci, Dept Genet & Evolut, Adelaide, SA, Australia
[6] Flinders Univ S Australia, Fac Sci & Engn, Sch Biol Sci, Adelaide, SA 5001, Australia
Keywords
ACID SIDE-CHAINS; DRUG DISCOVERY; SYSTEMS BIOLOGY; TARGETS; FEATURES; TRANSPORTERS; EXPLORATION; ATTRIBUTES; SOLUBILITY; ORGANISMS;
DOI
10.1016/j.drudis.2016.01.007
Chinese Library Classification number
R9 [Pharmacy];
Discipline classification code
1007;
Abstract
The application of computational methods in drug discovery has received increased attention in recent years as a way to accelerate drug target prediction. Using 443 sequence-derived protein features, we applied the most commonly used machine learning methods to predict whether a protein is druggable and to identify the best-performing algorithm for this task. In addition, feature selection procedures were used to obtain the best performance of each classifier at its optimum number of features. When run on all features, Neural Network was the best classifier, with 89.98% accuracy in a k-fold cross-validation test. Among all the algorithms applied, the optimum number of most-relevant features was 130, according to the Support Vector Machine-Feature Selection (SVM-FS) algorithm. This study resulted in the discovery of new drug targets that can potentially be employed in cell signaling pathways, gene expression, and signal transduction. The DrugMiner web tool was developed based on the findings of this study to enable researchers to predict druggable proteins. DrugMiner is freely available at www.DrugMiner.org.
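The k-fold cross-validation accuracy mentioned in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration only: the abstract does not state the value of k, the data are synthetic rather than the 443 sequence-derived features, and a simple nearest-centroid classifier stands in for the Neural Network and SVM-FS methods the authors report. All function names are hypothetical.

```python
import random
from statistics import mean


def k_fold_indices(n_samples, k, seed=0):
    """Shuffle sample indices and split them into k near-equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def nearest_centroid_fit(X, y):
    """Compute one mean feature vector (centroid) per class label."""
    centroids = {}
    for label in set(y):
        rows = [x for x, t in zip(X, y) if t == label]
        centroids[label] = [mean(col) for col in zip(*rows)]
    return centroids


def nearest_centroid_predict(centroids, x):
    """Return the label whose centroid is closest (squared Euclidean distance)."""
    def sq_dist(c):
        return sum((a - b) ** 2 for a, b in zip(x, c))
    return min(centroids, key=lambda label: sq_dist(centroids[label]))


def cross_val_accuracy(X, y, k=5, seed=0):
    """Train on k-1 folds, score on the held-out fold, and average over folds."""
    scores = []
    for held_out in k_fold_indices(len(X), k, seed):
        held = set(held_out)
        train = [i for i in range(len(X)) if i not in held]
        model = nearest_centroid_fit([X[i] for i in train], [y[i] for i in train])
        correct = sum(nearest_centroid_predict(model, X[i]) == y[i] for i in held_out)
        scores.append(correct / len(held_out))
    return mean(scores)


def make_toy_data(n_per_class=50, n_features=5, seed=1):
    """Synthetic stand-in data: two Gaussian classes (label 1 plays 'druggable')."""
    rng = random.Random(seed)
    X, y = [], []
    for label, shift in ((0, 0.0), (1, 3.0)):
        for _ in range(n_per_class):
            X.append([rng.gauss(shift, 1.0) for _ in range(n_features)])
            y.append(label)
    return X, y


X, y = make_toy_data()
accuracy = cross_val_accuracy(X, y, k=5)
print(f"mean 5-fold accuracy: {accuracy:.3f}")
```

The reported accuracy is the average over held-out folds, so every sample is used for testing exactly once and never scored by a model that was trained on it.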
Pages: 718-724
Page count: 7