Prediction of human disease-associated phosphorylation sites with combined feature selection approach and support vector machine

Cited by: 12
Authors
Xu, Xiaoyi [1]
Li, Ao [1, 2]
Wang, Minghui [1, 2]
Affiliations
[1] Univ Sci & Technol China, Sch Informat Sci & Technol, AH-230027 Hefei, Peoples R China
[2] Univ Sci & Technol China, Ctr Biomed Engn, AH-230027 Hefei, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
proteins; cellular biophysics; diseases; support vector machines; feature selection; filtering theory; medical computing; bioinformatics; forward feature selection process; minimum-redundancy-maximum-relevance filtering process; cellular process; post-translational modification; support vector machine; human disease-associated phosphorylation sites; PROTEIN-PHOSPHORYLATION; PATTERN-RECOGNITION; IDENTIFICATION; SEQUENCE;
DOI
10.1049/iet-syb.2014.0051
Chinese Library Classification
Q2 [Cell Biology];
Discipline classification codes
071009 ; 090102 ;
Abstract
Phosphorylation is a crucial post-translational modification that regulates almost all cellular processes. It has long been recognised that protein phosphorylation is closely related to disease, and many studies have therefore been undertaken to predict phosphorylation sites for disease treatment and drug design. However, despite the success of these approaches, no method focuses on predicting disease-associated phosphorylation sites. Herein, for the first time, the authors propose a novel approach specifically designed to identify associations between phosphorylation sites and human diseases. To take full advantage of local sequence information, a combined feature selection method-based support vector machine (CFS-SVM) is developed that incorporates a minimum-redundancy-maximum-relevance filtering process and a forward feature selection process. Performance evaluation shows that CFS-SVM is significantly better than widely used classifiers, including Bayesian decision theory, k-nearest neighbour and random forest. Even at an extremely high specificity of 99%, CFS-SVM still achieves a high sensitivity. In addition, tests on extra data confirm the effectiveness and general applicability of the CFS-SVM approach across a variety of diseases. Finally, analysis of the selected features and corresponding kinases also helps in understanding the potential mechanisms of disease-phosphorylation relationships and can guide further experimental validation.
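The two-stage selection named in the abstract (a minimum-redundancy-maximum-relevance filter followed by forward feature selection) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract gives no algorithmic details, so the mutual-information scoring, the "relevance minus mean redundancy" criterion, the toy residue columns, and every function name below are assumptions made for illustration. In particular, `loo_accuracy` is a simple lookup-table scorer standing in for the cross-validated SVM that a CFS-SVM-style pipeline would use.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Mutual information in bits between two equal-length discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum((c / n) * math.log2(c * n / (px[x] * py[y]))
               for (x, y), c in pxy.items())


def mrmr_rank(columns, labels, k):
    """Greedy mRMR filter: pick the feature maximising relevance minus mean redundancy."""
    relevance = [mutual_information(col, labels) for col in columns]
    selected, remaining = [], list(range(len(columns)))

    def score(j):
        if not selected:
            return relevance[j]
        redundancy = sum(mutual_information(columns[j], columns[s])
                         for s in selected) / len(selected)
        return relevance[j] - redundancy

    while remaining and len(selected) < k:
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


def forward_select(ranked, evaluate):
    """Wrapper step: keep a ranked feature only if it improves the score."""
    chosen, best = [], float("-inf")
    for j in ranked:
        trial = chosen + [j]
        s = evaluate(trial)
        if s > best:
            chosen, best = trial, s
    return chosen, best


def loo_accuracy(columns, labels, subset):
    """Stand-in scorer (NOT an SVM): leave-one-out accuracy of a lookup-table vote."""
    rows = [tuple(columns[j][i] for j in subset) for i in range(len(labels))]
    hits = 0
    for i, row in enumerate(rows):
        others = [(r, y) for t, (r, y) in enumerate(zip(rows, labels)) if t != i]
        # Vote among samples with the same pattern; fall back to all other samples.
        votes = [y for r, y in others if r == row] or [y for _, y in others]
        hits += Counter(votes).most_common(1)[0][0] == labels[i]
    return hits / len(labels)


# Hypothetical toy data: each column is one window position, each character a residue.
labels = [1, 1, 1, 1, 1, 1, 1, 0, 0, 0]
columns = [
    "RRRRRRAAAA",  # position 0: strongly associated with the label
    "KKKKKKGGGG",  # position 1: carries the same information as position 0
    "SSSTTTSTTT",  # position 2: weaker, but complementary to position 0
]

ranked = mrmr_rank(columns, labels, k=3)
chosen, best = forward_select(ranked, lambda s: loo_accuracy(columns, labels, s))
print(ranked, chosen, best)
```

On this toy data the filter ranks the redundant copy (position 1) last, and the wrapper keeps only the two complementary positions. Splitting the work this way keeps the expensive classifier evaluations to one per ranked feature rather than one per candidate subset.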
Pages: 155-163
Number of pages: 9
Related papers
50 in total
[32]   A reliable method for colorectal cancer prediction based on feature selection and support vector machine [J].
Zhao, Dandan ;
Liu, Hong ;
Zheng, Yuanjie ;
He, Yanlin ;
Lu, Dianjie ;
Lyu, Chen .
MEDICAL & BIOLOGICAL ENGINEERING & COMPUTING, 2019, 57 (04) :901-912
[33]   BLProt: prediction of bioluminescent proteins based on support vector machine and relieff feature selection [J].
Krishna Kumar Kandaswamy ;
Ganesan Pugalenthi ;
Mehrnaz Khodam Hazrati ;
Kai-Uwe Kalies ;
Thomas Martinetz .
BMC Bioinformatics, 12
[34]   AutoMotif Server for prediction of phosphorylation sites in proteins using support vector machine: 2007 update [J].
Plewczynski, Dariusz ;
Tkacz, Adrian ;
Wyrwicz, Lucjan S. ;
Rychlewski, Leszek ;
Ginalski, Krzysztof .
JOURNAL OF MOLECULAR MODELING, 2008, 14 (01) :69-76
[35]   Group feature selection with multiclass support vector machine [J].
Tang, Fengzhen ;
Adam, Lukas ;
Si, Bailu .
NEUROCOMPUTING, 2018, 317 :42-49
[37]   PhosphoSVM: prediction of phosphorylation sites by integrating various protein sequence attributes with a support vector machine [J].
Yongchao Dou ;
Bo Yao ;
Chi Zhang .
Amino Acids, 2014, 46 :1459-1469
[38]   Feature clustering based support vector machine recursive feature elimination for gene selection [J].
Huang, Xiaojuan ;
Zhang, Li ;
Wang, Bangjun ;
Li, Fanzhang ;
Zhang, Zhao .
APPLIED INTELLIGENCE, 2018, 48 (03) :594-607
[39]   The Improved Particle Swarm Optimization for Feature Selection of Support Vector Machine [J].
Wang, Sipeng ;
Ding, Sheng .
PROCEEDINGS OF 2017 2ND INTERNATIONAL CONFERENCE ON COMMUNICATION AND INFORMATION SYSTEMS (ICCIS 2017), 2015, :314-317
[40]   The research on the method of feature selection in support vector Machine based Entropy [J].
Zhu, Xiaoyan ;
Tian, Xi ;
Zhu, Xiaoxun .
PROGRESS IN POWER AND ELECTRICAL ENGINEERING, PTS 1 AND 2, 2012, 354-355 :1192-+