Feature extraction and dimensionality reduction for mass spectrometry data

Cited by: 26
Authors
Liu, Yihui [1]
Affiliations
[1] Shandong Inst Light Ind, Sch Comp Sci & Informat Technol, Jinan 250353, Shandong, Peoples R China
Keywords
Mass spectrometry data; Feature extraction; Wavelet analysis; Support vector machine; PROTEOMIC PATTERNS; SERUM; DECOMPOSITION
DOI
10.1016/j.compbiomed.2009.06.012
Chinese Library Classification number
Q [Biological Sciences]
Discipline classification codes
07; 0710; 09
Abstract
Mass spectrometry is being used to generate protein profiles from human serum, and proteomic data obtained from mass spectrometry have attracted great interest for the detection of early-stage cancer. However, the high dimensionality of mass spectrometry data poses considerable challenges. In this paper we propose a feature extraction algorithm based on wavelet analysis for high-dimensional mass spectrometry data. A set of wavelet detail coefficients at different scales is used to detect the transient changes in mass spectrometry data. The experiments are performed on two datasets. The method achieves highly competitive accuracy compared with the best performance of other kinds of classification models. Experimental results show that the wavelet detail coefficients are an efficient way to characterize the features of high-dimensional mass spectra and to reduce their dimensionality. (C) 2009 Elsevier Ltd. All rights reserved.
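The idea of using detail coefficients at several scales as features can be illustrated with a minimal sketch. The abstract does not name the wavelet family, the decomposition depth, or the classifier settings, so everything specific here is an assumption made for illustration only: a Haar wavelet, three decomposition levels, a synthetic 16-point "spectrum", and the hypothetical helper names `haar_step` and `detail_features`. It is not the author's implementation.

```python
import math


def haar_step(signal):
    """One level of the Haar DWT; returns (approximation, detail)."""
    if len(signal) % 2:
        # Assumption: pad odd-length input by repeating the last sample.
        signal = signal + [signal[-1]]
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail


def detail_features(spectrum, levels):
    """Collect the detail coefficients at each scale, finest scale first."""
    approx = list(spectrum)
    details = []
    for _ in range(levels):
        approx, d = haar_step(approx)
        details.append(d)
    return details


# Synthetic toy spectrum (not real data): flat baseline with one sharp peak.
spectrum = [1.0] * 16
spectrum[5] = 9.0

details = detail_features(spectrum, levels=3)

# Using only the coarsest-scale details shrinks 16 samples to 2 features.
feature_vector = details[-1]
```

In this toy case the only nonzero finest-scale coefficient is the one whose pair of samples contains the peak, which is the sense in which detail coefficients respond to transient changes. Each further level halves the number of coefficients, which is where the dimensionality reduction comes from. A feature vector of this kind could then be passed to a classifier such as the support vector machine listed in the keywords; that step is not shown here.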
Pages: 818-823
Number of pages: 6