SVM-RFE based feature selection for tandem mass spectrum quality assessment

Cited by: 27
Authors
Ding, Jiarui [1]
Shi, Jinhong [1]
Wu, Fang-Xiang [1]
Affiliations
[1] Univ Saskatchewan, Div Biomed Engn, Dept Mech Engn, Saskatoon, SK S7N 5A9, Canada
Funding
Natural Sciences and Engineering Research Council of Canada
Keywords
feature selection; SVM-RFE; tandem mass spectra; quality assessment; proteomics; SUPPORT VECTOR MACHINES; SPECTROMETRY-BASED PROTEOMICS; GENE SELECTION; IDENTIFICATION; CLASSIFICATION; INFORMATION; THROUGHPUT; PREDICTION; PEPTIDES;
DOI
10.1504/IJDMB.2011.038578
Chinese Library Classification number
Q [Biological Sciences]
Discipline classification codes
07; 0710; 09
Abstract
In the literature, hundreds of features have been proposed to assess the quality of tandem mass spectra. However, many of these features are irrelevant to spectrum quality, and including them can degrade the performance of spectrum quality assessment. We propose a two-stage method based on Recursive Feature Elimination with Support Vector Machines (SVM-RFE) to select the most relevant features from those collected in the literature. Classifiers are then trained to verify the relevance of the selected features. The results demonstrate that the selected features describe the quality of tandem mass spectra better and hence improve the performance of tandem mass spectrum quality assessment.
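For orientation, the generic SVM-RFE loop can be sketched as follows: train a linear SVM, score each feature by its squared weight, drop the lowest-scoring feature, and repeat. This is a minimal single-stage illustration only. The two-stage variant proposed in the paper is not described in this record and is not reproduced here. The toy data, the simple subgradient-descent trainer, and all function names and hyperparameters (`train_linear_svm`, `svm_rfe`, `make_toy_data`, `lam`, `lr`, `epochs`) are assumptions made for the example, not the authors' implementation or features.

```python
import random


def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=200):
    """Fit (w, b) by full-batch subgradient descent on the
    L2-regularised hinge loss. Hyperparameters are illustrative."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w = [lam * wj for wj in w]  # gradient of the regulariser
        grad_b = 0.0
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            if margin < 1.0:  # sample violates the margin
                for j in range(d):
                    grad_w[j] -= yi * xi[j] / n
                grad_b -= yi / n
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def svm_rfe(X, y):
    """Return feature indices ordered from most to least relevant."""
    remaining = list(range(len(X[0])))
    eliminated = []
    while len(remaining) > 1:
        X_sub = [[row[j] for j in remaining] for row in X]
        w, _ = train_linear_svm(X_sub, y)
        # Ranking criterion: squared weight of each surviving feature.
        worst = min(range(len(remaining)), key=lambda k: w[k] ** 2)
        eliminated.append(remaining.pop(worst))
    # The last survivor ranks first; the earliest eliminated ranks last.
    return remaining + eliminated[::-1]


def make_toy_data(n=100, n_noise=4, seed=0):
    """Hypothetical data: feature 0 carries the label, the rest are noise."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = 1 if i % 2 == 0 else -1
        informative = 3.0 * label + rng.gauss(0.0, 0.5)
        X.append([informative] + [rng.gauss(0.0, 1.0) for _ in range(n_noise)])
        y.append(label)
    return X, y


X, y = make_toy_data()
ranking = svm_rfe(X, y)
print("features, most to least relevant:", ranking)
```

On this toy set the informative feature (index 0) should survive every elimination round and so appear first in the ranking. In practice a library SVM and cross-validation would replace the hand-written trainer.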
Pages: 73-88
Page count: 16