Predicting Protein-Protein Interactions based on Biological Information using Extreme Gradient Boosting

Cited by: 10
Authors
Beltran, Jerome Cary [1 ]
Valdez, Paolo [2 ]
Naval, Prospero, Jr. [1 ]
Affiliations
[1] Univ Philippines Diliman, Dept Comp Sci, Quezon City, Philippines
[2] Univ Philippines Diliman, Elect & Elect Engn Inst, Quezon City, Philippines
Source
2019 16TH IEEE INTERNATIONAL CONFERENCE ON COMPUTATIONAL INTELLIGENCE IN BIOINFORMATICS AND COMPUTATIONAL BIOLOGY - CIBCB 2019 | 2019
Keywords
Protein-Protein Interaction; Machine Learning; Ensemble Learning; Extreme Gradient Boosting; Support Vector Machine; Random Forest;
DOI
10.1109/cibcb.2019.8791241
Chinese Library Classification number
TP18 [Artificial Intelligence Theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Protein-protein interactions (PPIs) are vital to numerous biological processes. Computational methods have been used to predict PPIs from protein sequences. Several studies utilize popular algorithms such as Support Vector Machines (SVM) and Random Forest (RF) for detecting PPIs. The hypothesis of this study is that Extreme Gradient Boosting (XGBoost), which uses gradient boosted decision trees as the base classifier, can produce comparable results to those produced by SVM and RF. Based on the experimental results for the assembled protein interaction dataset, XGBoost produced better results than SVM and RF for the majority of the metrics used.
Pages: 346-351
Number of pages: 6