Bag of Peaks: interpretation of NMR spectrometry

Cited by: 9
Authors
Brelstaff, Gavin [4]
Bicego, Manuele [1]
Culeddu, Nicola [2]
Chessa, Matilde [3]
Affiliations
[1] Univ Sassari, DEIR, I-07100 Sassari, Italy
[2] CNR, ICB, I-07040 Sassari, Italy
[3] Porto Conte Ric, Loc Tramariglio, Alghero, Italy
[4] CRS4, Biocomp, I-09100 Sardinia, Italy
Keywords
CLASSIFICATION;
DOI
10.1093/bioinformatics/btn599
Chinese Library Classification number
Q5 [Biochemistry];
Discipline classification code
071010 ; 081704 ;
Abstract
Motivation: The analysis of high-resolution proton nuclear magnetic resonance (NMR) spectrometry can help human experts implicate metabolites expressed by diseased biofluids. Here, we explore an intermediate representation, between spectral trace and classifier, that can furnish a communicative interface between expert and machine. This representation permits classification accuracies equivalent to, or better than, those of either principal component analysis (PCA) or multi-dimensional scaling (MDS). In the training phase, the peaks in each trace are detected and clustered in order to compile a common dictionary, which could be visualized and adjusted by an expert. The dictionary is used to characterize each trace with a fixed-length feature vector, termed Bag of Peaks, ready to be classified with classical supervised methods.
Results: Our small-scale study, concerning Type I diabetes in Sardinian children, provides a preliminary indication of the effectiveness of the Bag of Peaks approach over standard PCA and MDS. Consistently higher classification accuracies are obtained once a sufficient number of peaks (>10) are included in the dictionary. A large-scale simulation of noisy spectra further confirms this advantage. Finally, suggestions for metabolite-peak loci that may be implicated in the disease are obtained by applying standard feature selection techniques.
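The training steps described in the abstract (detect peaks, cluster them into a common dictionary, encode each trace as a fixed-length vector) can be illustrated with a minimal Python sketch. The abstract does not specify the peak detector, the clustering algorithm, or what each vector entry holds, so everything concrete here is an assumption made for illustration: the strict local-maximum detector, the gap-based 1-D clustering, the summed peak heights, and the names `detect_peaks`, `build_dictionary`, `bag_of_peaks`, `max_gap` and `max_dist` are all hypothetical and not taken from the paper.

```python
from typing import List, Tuple

Peak = Tuple[int, float]  # (index along the spectral axis, intensity)


def detect_peaks(trace: List[float], min_height: float = 0.0) -> List[Peak]:
    """Return strict local maxima of `trace` with intensity >= `min_height`."""
    peaks = []
    for i in range(1, len(trace) - 1):
        is_max = trace[i] > trace[i - 1] and trace[i] > trace[i + 1]
        if is_max and trace[i] >= min_height:
            peaks.append((i, trace[i]))
    return peaks


def build_dictionary(peak_lists: List[List[Peak]], max_gap: int = 1) -> List[float]:
    """Pool peak positions from all training traces and cluster them.

    Assumption: neighbouring positions at most `max_gap` apart share a
    cluster; each dictionary entry is the mean position of one cluster.
    """
    positions = sorted(pos for peaks in peak_lists for pos, _ in peaks)
    if not positions:
        return []
    clusters = [[positions[0]]]
    for pos in positions[1:]:
        if pos - clusters[-1][-1] <= max_gap:
            clusters[-1].append(pos)
        else:
            clusters.append([pos])
    return [sum(c) / len(c) for c in clusters]


def bag_of_peaks(peaks: List[Peak], dictionary: List[float],
                 max_dist: float = 1.5) -> List[float]:
    """Encode one trace as a vector with one entry per dictionary peak.

    Assumption: each detected peak adds its height to the nearest dictionary
    entry, and is ignored if that entry is farther away than `max_dist`.
    """
    vector = [0.0] * len(dictionary)
    if not dictionary:
        return vector
    for pos, height in peaks:
        nearest = min(range(len(dictionary)),
                      key=lambda k: abs(dictionary[k] - pos))
        if abs(dictionary[nearest] - pos) <= max_dist:
            vector[nearest] += height
    return vector


# Two toy "spectra": a shared peak near index 2-3 plus one distinct peak each.
traces = [
    [0, 1, 5, 1, 0, 0, 3, 0, 0, 0],
    [0, 0, 1, 4, 1, 0, 0, 0, 2, 0],
]
peak_lists = [detect_peaks(t) for t in traces]
dictionary = build_dictionary(peak_lists)
vectors = [bag_of_peaks(p, dictionary) for p in peak_lists]
print(dictionary)  # [2.5, 6.0, 8.0]
print(vectors)     # [[5.0, 3.0, 0.0], [4.0, 0.0, 2.0]]
```

Both toy traces end up as vectors of the same length regardless of how many peaks each contains, which is what makes the representation usable as input to a standard supervised classifier. The dictionary itself is a short list of peak positions, the kind of object an expert could inspect and adjust.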
Pages: 258-264
Page count: 7