Prediction of human disease-associated phosphorylation sites with combined feature selection approach and support vector machine

Cited by: 12
Authors
Xu, Xiaoyi [1 ]
Li, Ao [1 ,2 ]
Wang, Minghui [1 ,2 ]
Affiliations
[1] Univ Sci & Technol China, Sch Informat Sci & Technol, AH-230027 Hefei, Peoples R China
[2] Univ Sci & Technol China, Ctr Biomed Engn, AH-230027 Hefei, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
proteins; cellular biophysics; diseases; support vector machines; feature selection; filtering theory; medical computing; bioinformatics; forward feature selection process; minimum-redundancy-maximum-relevance filtering process; cellular process; post-translational modification; support vector machine; human disease-associated phosphorylation sites; PROTEIN-PHOSPHORYLATION; PATTERN-RECOGNITION; IDENTIFICATION; SEQUENCE;
DOI
10.1049/iet-syb.2014.0051
Chinese Library Classification number
Q2 [Cell biology];
Discipline classification code
071009 ; 090102 ;
Abstract
Phosphorylation is a crucial post-translational modification that regulates almost all cellular processes. It has long been recognised that protein phosphorylation is closely related to disease, and many studies have therefore been undertaken to predict phosphorylation sites for disease treatment and drug design. However, despite the success of these approaches, none focuses on predicting disease-associated phosphorylation sites. Herein, the authors propose, for the first time, an approach specifically designed to identify associations between phosphorylation sites and human diseases. To take full advantage of local sequence information, a combined feature selection method-based support vector machine (CFS-SVM) is developed that incorporates a minimum-redundancy-maximum-relevance (mRMR) filtering process and a forward feature selection process. Performance evaluation shows that CFS-SVM performs significantly better than widely used classifiers, including Bayesian decision theory, k-nearest neighbour and random forest. Even at an extremely high specificity of 99%, CFS-SVM still achieves high sensitivity. In addition, tests on extra data confirm the effectiveness and general applicability of the CFS-SVM approach to a variety of diseases. Finally, analysis of the selected features and corresponding kinases helps to explain the potential mechanisms of disease-phosphorylation relationships and can guide further experimental validation.
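The two-stage selection named in the abstract (a minimum-redundancy-maximum-relevance filter followed by forward feature selection) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the toy residue columns, and the greedy acceptance rule are assumptions made for illustration, and the `evaluate` callback is a hypothetical stand-in for whatever classifier score (for example, cross-validated SVM performance) would be used in practice.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Mutual information (in nats) between two equal-length discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(
        (c / n) * math.log(c * n / (px[x] * py[y]))
        for (x, y), c in pxy.items()
    )


def mrmr_rank(columns, labels):
    """Greedy mRMR ordering: relevance to the label minus mean redundancy
    with the features already ranked (difference form of the criterion)."""
    relevance = [mutual_information(col, labels) for col in columns]
    remaining = list(range(len(columns)))
    ranked = []
    while remaining:
        def score(j):
            if not ranked:
                return relevance[j]
            redundancy = sum(
                mutual_information(columns[j], columns[k]) for k in ranked
            ) / len(ranked)
            return relevance[j] - redundancy
        best = max(remaining, key=score)
        ranked.append(best)
        remaining.remove(best)
    return ranked


def forward_select(ranked, evaluate):
    """Walk the ranked list and keep a feature only if it improves the score.
    This acceptance rule is an assumption, not taken from the paper."""
    selected, best_score = [], float("-inf")
    for j in ranked:
        candidate_score = evaluate(selected + [j])
        if candidate_score > best_score:
            selected.append(j)
            best_score = candidate_score
    return selected, best_score


# Toy data (hypothetical): each column is the residue at one window position.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
columns = [
    ["S", "S", "S", "T", "T", "T", "T", "T"],  # informative
    ["S", "S", "S", "T", "T", "T", "T", "T"],  # exact duplicate of column 0
    ["K", "A", "K", "K", "A", "A", "K", "A"],  # weaker but complementary
]
ranked = mrmr_rank(columns, labels)


def toy_evaluate(subset):
    """Hypothetical score: rewards columns 0 and 2, penalises subset size."""
    return 10 * len(set(subset) & {0, 2}) - len(subset)


selected, best_score = forward_select(ranked, toy_evaluate)
```

In this toy case the duplicate column is ranked last because its redundancy with column 0 outweighs its relevance, which is the behaviour the mRMR criterion is meant to produce. Replacing `toy_evaluate` with a real classifier score is left to the reader's own setup.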
Pages: 155-163
Number of pages: 9