Prediction of human disease-associated phosphorylation sites with combined feature selection approach and support vector machine

Cited by: 11
Authors
Xu, Xiaoyi [1]
Li, Ao [1,2]
Wang, Minghui [1,2]
Affiliations
[1] Univ Sci & Technol China, Sch Informat Sci & Technol, AH-230027 Hefei, Peoples R China
[2] Univ Sci & Technol China, Ctr Biomed Engn, AH-230027 Hefei, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
proteins; cellular biophysics; diseases; support vector machines; feature selection; filtering theory; medical computing; bioinformatics; forward feature selection process; minimum-redundancy-maximum-relevance filtering process; cellular process; post-translational modification; support vector machine; human disease-associated phosphorylation sites; PROTEIN-PHOSPHORYLATION; PATTERN-RECOGNITION; IDENTIFICATION; SEQUENCE;
DOI
10.1049/iet-syb.2014.0051
Chinese Library Classification (CLC) number
Q2 [Cell Biology];
Subject classification code
071009 ; 090102 ;
Abstract
Phosphorylation is a crucial post-translational modification that regulates almost all cellular processes. It has long been recognised that protein phosphorylation is closely related to disease, and many studies have therefore been undertaken to predict phosphorylation sites for disease treatment and drug design. However, despite the success of these approaches, none focuses on predicting disease-associated phosphorylation sites. Herein, for the first time, the authors propose a novel approach specifically designed to identify associations between phosphorylation sites and human diseases. To take full advantage of local sequence information, a combined feature selection method-based support vector machine (CFS-SVM) is developed that incorporates a minimum-redundancy-maximum-relevance filtering process and a forward feature selection process. Performance evaluation shows that CFS-SVM is significantly better than widely used classifiers, including Bayesian decision theory, k nearest neighbour and random forest. Even at an extremely high specificity of 99%, CFS-SVM still achieves a high sensitivity. In addition, tests on extra data confirm the effectiveness and general applicability of the CFS-SVM approach across a variety of diseases. Finally, analysis of the selected features and corresponding kinases also helps in understanding the potential mechanisms of disease-phosphorylation relationships and can guide further experimental validation.
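The two-stage selection named in the abstract (a minimum-redundancy-maximum-relevance filter followed by forward selection around an SVM) might be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: the function names `mrmr_rank` and `forward_select`, the use of absolute Pearson correlation as the redundancy term, the RBF kernel, the accuracy criterion, and the synthetic data are all assumptions made here for brevity.

```python
import numpy as np
from sklearn.feature_selection import mutual_info_classif
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC


def mrmr_rank(X, y, n_keep, random_state=0):
    """Greedy mRMR-style ranking: relevance minus mean redundancy.

    Relevance is mutual information with the label. Redundancy is
    approximated by absolute Pearson correlation (an assumption of this
    sketch, not necessarily the estimator used in the paper).
    """
    relevance = mutual_info_classif(X, y, random_state=random_state)
    corr = np.nan_to_num(np.abs(np.corrcoef(X, rowvar=False)))
    selected = [int(np.argmax(relevance))]
    while len(selected) < min(n_keep, X.shape[1]):
        remaining = [j for j in range(X.shape[1]) if j not in selected]
        scores = [relevance[j] - corr[j, selected].mean() for j in remaining]
        selected.append(remaining[int(np.argmax(scores))])
    return selected


def forward_select(X, y, ranked, cv=5):
    """Walk the ranked list; keep a feature only if CV accuracy improves."""
    chosen, best = [], 0.0
    for j in ranked:
        trial = chosen + [j]
        score = cross_val_score(SVC(kernel="rbf"), X[:, trial], y, cv=cv).mean()
        if score > best:
            chosen, best = trial, score
    return chosen, best


if __name__ == "__main__":
    # Synthetic stand-in for encoded sequence windows (hypothetical data).
    rng = np.random.default_rng(0)
    y = rng.integers(0, 2, size=200)
    X = rng.normal(size=(200, 12))
    X[:, 3] += 2.0 * y  # make one feature informative
    ranked = mrmr_rank(X, y, n_keep=6)
    chosen, acc = forward_select(X, y, ranked)
    print(chosen, round(acc, 3))
```

Filtering first keeps the wrapper stage cheap, since the SVM is retrained only on the short ranked list and not on every candidate feature. A real evaluation at a fixed specificity such as 99% would need a thresholded decision score in place of the plain accuracy used here.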
Pages: 155 - 163
Number of pages: 9
Related papers
50 in total
  • [21] Diagnosis of Chronic Kidney Disease Based on Support Vector Machine by Feature Selection Methods
    Huseyin Polat
    Homay Danaei Mehr
    Aydin Cetin
    Journal of Medical Systems, 2017, 41
  • [22] Diagnosis of Chronic Kidney Disease Based on Support Vector Machine by Feature Selection Methods
    Polat, Huseyin
    Mehr, Homay Danaei
    Cetin, Aydin
    JOURNAL OF MEDICAL SYSTEMS, 2017, 41 (04)
  • [23] Stock trend prediction based on fractal feature selection and support vector machine
    Ni, Li-Ping
    Ni, Zhi-Wei
    Gao, Ya-Zhuo
    EXPERT SYSTEMS WITH APPLICATIONS, 2011, 38 (05) : 5569 - 5576
  • [24] A novel feature selection method for twin support vector machine
    Bai, Lan
    Wang, Zhen
    Shao, Yuan-Hai
    Deng, Nai-Yang
    KNOWLEDGE-BASED SYSTEMS, 2014, 59 : 1 - 8
  • [25] Sparse Support Vector Machine with Lp Penalty for Feature Selection
    Lan Yao
    Feng Zeng
    Dong-Hui Li
    Zhi-Gang Chen
    Journal of Computer Science and Technology, 2017, 32 : 68 - 77
  • [26] On domain knowledge and feature selection using a support vector machine
    Barzilay, O
    Brailovsky, VL
    PATTERN RECOGNITION LETTERS, 1999, 20 (05) : 475 - 484
  • [27] Exploring Feature Selection and Support Vector Machine in Text Categorization
    Abdul-Rahman, Shuzlina
    Mutalib, Sofianita
    Khanafi, Nur Amira
    Ali, Azliza Mohd
    2013 IEEE 16TH INTERNATIONAL CONFERENCE ON COMPUTATIONAL SCIENCE AND ENGINEERING (CSE 2013), 2013, : 1101 - 1104
  • [28] Prediction of cis/trans isomerization using feature selection and support vector machines
    Exarchos, Konstantinos P.
    Papaloukas, Costas
    Exarchos, Themis P.
    Troganis, Anastassios N.
    Fotiadis, Dimitrios I.
    JOURNAL OF BIOMEDICAL INFORMATICS, 2009, 42 (01) : 140 - 149
  • [29] Prediction of Protein Phosphorylation Sites by Support Vector Machines
    Ishino, Tomoki
    Nishikawa, Ikuko
    Fukuchi, Satoshi
    Tohsato, Yukako
    Nishikawa, Ken
    PROCEEDINGS OF THE 2013 6TH INTERNATIONAL CONFERENCE ON BIOMEDICAL ENGINEERING AND INFORMATICS (BMEI 2013), VOLS 1 AND 2, 2013, : 817 - 821
  • [30] An integrated approach of feature selection and parameter optimisation of kernel to enhance the performance of support vector machine
    Sarojini, Balakrishnan
    INTERNATIONAL JOURNAL OF COMMUNICATION NETWORKS AND DISTRIBUTED SYSTEMS, 2015, 15 (2-3) : 265 - 278