Support vector machine based classification of fast Fourier transform spectroscopy of proteins

Cited by: 5
Authors
Lazarevic, Aleksandar [1]
Pokrajac, Dragoljub [1]
Marcano, Aristides [1]
Melikechi, Noureddine [1]
Affiliations
[1] Delaware State Univ, Dept Phys & Preengn, Ctr Res & Educ Opt Sci & Applicat, Dover, DE 19901 USA
Source
ADVANCED BIOMEDICAL AND CLINICAL DIAGNOSTIC SYSTEMS VII | 2009 / Vol. 7169
Keywords
Fourier Transform Infrared Spectroscopy; infrared spectra of proteins; principal component analysis; support vector machine; classification
DOI
10.1117/12.809964
Chinese Library Classification number
O43 [Optics]
Discipline classification code
070207; 0803
Abstract
Fast Fourier transform spectroscopy has proved to be a powerful method for studying the secondary structure of proteins, since peak positions and their relative amplitudes are affected by the number of hydrogen bridges that sustain this secondary structure. However, to the best of our knowledge, the method has not yet been used to identify proteins within a complex matrix such as a blood sample. The principal reason is the apparent similarity of protein infrared spectra, with actual differences usually masked by the solvent contribution and other interactions. In this paper, we propose a novel machine learning based method that uses protein spectra to classify and identify such proteins within a given sample. The proposed method uses principal component analysis (PCA) to identify the most important linear combinations of the original spectral components and then applies a support vector machine (SVM) classification model to these combinations to categorize proteins into one of the given groups. Our experiments were performed on a set of four different proteins: Bovine Serum Albumin, Leptin, Insulin-like Growth Factor 2 and Osteopontin. The proposed combination of principal component analysis and support vector machines exhibits excellent classification accuracy when identifying proteins from their infrared spectra.
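The two-stage approach described in the abstract (PCA for dimensionality reduction, then an SVM classifier on the retained components) can be sketched roughly as follows. This is a minimal illustration using scikit-learn, not the authors' implementation: the spectra are synthetic toy data, and the wavenumber axis, band centers, number of retained components, kernel, and hyperparameters are all assumptions chosen for the example.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def make_synthetic_spectra(n_per_class=30, n_points=200, seed=0):
    """Generate toy 'spectra': one Gaussian band per class plus noise.

    Purely illustrative; these are NOT measured protein spectra.
    """
    rng = np.random.default_rng(seed)
    # Hypothetical wavenumber axis (cm^-1) and band centers, one per class.
    axis = np.linspace(1600.0, 1700.0, n_points)
    centers = [1630.0, 1640.0, 1650.0, 1660.0]
    spectra, labels = [], []
    for label, center in enumerate(centers):
        band = np.exp(-0.5 * ((axis - center) / 12.0) ** 2)
        noise = 0.05 * rng.standard_normal((n_per_class, n_points))
        spectra.append(band + noise)  # band broadcasts across all rows
        labels.append(np.full(n_per_class, label))
    return np.vstack(spectra), np.concatenate(labels)


# Four classes stand in for the four proteins named in the abstract.
X, y = make_synthetic_spectra()

# Standardize each spectral channel, project onto a few principal
# components, then classify the projected points with an SVM.
# n_components, kernel, C and gamma are assumed values, not the paper's.
model = make_pipeline(
    StandardScaler(),
    PCA(n_components=5),
    SVC(kernel="rbf", C=1.0, gamma="scale"),
)

# Stratified 5-fold cross-validated accuracy on the toy data.
scores = cross_val_score(model, X, y, cv=5)

# Fit on all samples and predict a class for each spectrum.
model.fit(X, y)
predicted = model.predict(X)
```

Placing PCA inside the pipeline means the principal components are re-estimated on each training fold, so the cross-validated accuracy is not inflated by information from the held-out spectra.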
Pages: 8