Identifying variables responsible for clustering in discriminant analysis of data from infrared microspectroscopy of a biological sample

Cited by: 112
Authors
Martin, Francis L. [1]
German, Matthew J.
Wit, Ernst
Fearn, Thomas
Ragavan, Narasimhan
Pollock, Hubert M.
Affiliations
[1] Univ Lancaster, Dept Phys, Lancaster LA1 4YB, England
[2] Univ Lancaster, Biomed Sci Unit, Lancaster, England
[3] Univ Newcastle Upon Tyne, Sch Dent Sci, Newcastle Upon Tyne NE1 7RU, Tyne & Wear, England
[4] Univ Lancaster, Dept Math & Stat, Stat Bioinformat Grp, Lancaster, England
[5] UCL, Dept Stat Sci, London, England
[6] Lancashire Teaching Hosp NHS Trust, Preston, Lancs, England
Keywords
adenocarcinoma; biomedical; clustering; LDA; microspectroscopy; misclassification
DOI
10.1089/cmb.2007.0057
Chinese Library Classification number
Q5 [Biochemistry]
Subject classification code
071010; 081704
Abstract
In the biomedical field, infrared (IR) spectroscopic studies can involve the processing of data derived from many samples, divided into classes such as category of tissue (e.g., normal or cancerous) or patient identity. We require reliable methods to identify the class-specific information on which of the wavenumbers, representing various molecular groups, are responsible for observed class groupings. Employing a prostate tissue sample divided into three regions (transition zone, peripheral zone, and adjacent adenocarcinoma), and interrogated using synchrotron Fourier-transform IR microspectroscopy, we compared two statistical methods: (a) a new "cluster vector" version of principal component analysis (PCA) in which the dimensions of the dataset are reduced, followed by linear discriminant analysis (LDA) to reveal clusters, through each of which a vector is constructed that identifies the contributory wavenumbers; and (b) stepwise LDA, which exploits the fact that spectral peaks which identify certain chemical bonds extend over several wavenumbers, and which, following classification via either one or two wavenumbers, checks whether the resulting predictions are stable across a range of nearby wavenumbers. Stepwise LDA is the simpler of the two methods; the cluster vector approach can indicate which of the different classes of spectra exhibit the significant differences in signal seen at the "prominent" wavenumbers identified. In situations where IR spectra are found to separate into classes, the excellent agreement between the two quite different methods points to what will prove to be a new and reliable approach to establishing which molecular groups are responsible for such separation.
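As a rough illustration of method (a) in the abstract, the Python sketch below reduces a set of spectra with PCA, applies LDA to the PCA scores, and maps each class centroid in discriminant space back onto the wavenumber axis, so that large absolute entries flag wavenumbers that may contribute to that cluster. It is not the authors' implementation: the function name pca_lda_cluster_vectors, the n_pcs parameter, the synthetic data, and the choice to draw each vector from the origin of the mean-centred data are all assumptions made for illustration.

```python
import numpy as np


def pca_lda_cluster_vectors(X, y, n_pcs=5):
    """Return (classes, vectors): one wavenumber-space vector per class.

    Hypothetical helper, not taken from the paper. X is (n_spectra,
    n_wavenumbers); y holds one class label per spectrum.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    Xc = X - X.mean(axis=0)               # mean-centre the spectra

    # PCA via SVD: rows of Vt are the principal-component loadings.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    P = Vt[:n_pcs].T                      # (n_wavenumbers, n_pcs)
    T = Xc @ P                            # PCA scores

    # LDA on the scores: within-class (Sw) and between-class (Sb) scatter.
    classes = np.unique(y)
    Sw = np.zeros((n_pcs, n_pcs))
    Sb = np.zeros((n_pcs, n_pcs))
    for c in classes:
        Tc = T[y == c]
        mc = Tc.mean(axis=0)
        Sw += (Tc - mc).T @ (Tc - mc)
        Sb += len(Tc) * np.outer(mc, mc)  # overall mean of T is zero

    # Discriminant directions: leading eigenvectors of Sw^-1 Sb.
    evals, evecs = np.linalg.eig(np.linalg.solve(Sw, Sb))
    order = np.argsort(evals.real)[::-1][: len(classes) - 1]
    W = evecs[:, order].real              # (n_pcs, n_classes - 1)

    # Assumed "cluster vector": the class centroid in discriminant space,
    # expressed in wavenumber space through the combined PCA-LDA loadings.
    Z = T @ W
    loadings = P @ W                      # (n_wavenumbers, n_classes - 1)
    vectors = np.array([loadings @ Z[y == c].mean(axis=0) for c in classes])
    return classes, vectors


# Synthetic stand-in for three tissue regions (not real spectra).
rng = np.random.default_rng(0)
n_per_class, n_wavenumbers = 60, 50
y = np.repeat([0, 1, 2], n_per_class)
X = rng.normal(scale=0.1, size=(3 * n_per_class, n_wavenumbers))
X[y == 1, 10] += 1.0   # class 1 has extra signal at variable index 10
X[y == 2, 30] += 1.0   # class 2 has extra signal at variable index 30

classes, vectors = pca_lda_cluster_vectors(X, y, n_pcs=5)
for c, v in zip(classes, vectors):
    # Index of the wavenumber with the largest absolute loading per class.
    print(c, int(np.argmax(np.abs(v))))
```

With real data, X would hold preprocessed absorbance spectra and the indices would be mapped back to wavenumbers in cm^-1. The number of retained components (n_pcs here) is a tuning choice that the abstract does not specify.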
Pages: 1176-1184
Page count: 9
Related papers
4 items in total
[1] Fearn T., 2002, HDB VIBRATIONAL SPEC, P2086
[2] Infrared spectroscopy with multivariate analysis potentially facilitates the segregation of different types of prostate cell [J].
German, MJ; Hammiche, A; Ragavan, N; Tobin, MJ; Cooper, LJ; Matanhelia, SS; Hindley, AC; Nicholson, CM; Fullwood, NJ; Pollock, HM; Martin, FL.
BIOPHYSICAL JOURNAL, 2006, 90 (10): 3783-3795
[3] ATR microspectroscopy with multivariate analysis segregates grades of exfoliative cervical cytology [J].
Walsh, Michael J.; Singh, Maneesh N.; Pollock, Hubert M.; Cooper, Leanne J.; German, Matthew J.; Stringfellow, Helen F.; Fullwood, Nigel J.; Paraskevaidis, Evangelos; Martin-Hirsch, Pierre L.; Martin, Francis L.
BIOCHEMICAL AND BIOPHYSICAL RESEARCH COMMUNICATIONS, 2007, 352 (01): 213-219
[4] Wit E., 2004, STAT MICROARRAYS DES