Pretreatment of mass spectral profiles: Application to proteomic data

Cited by: 28
Authors
Arneberg, Reidar
Rajalahti, Tarja
Flikka, Kristian
Berven, Frode S.
Kroksveen, Ann C.
Berle, Magnus
Myhr, Kjell-Morten
Vedeler, Christian A.
Ulvik, Rune J.
Kvalheim, Olav M. [1]
Affiliations
[1] Univ Bergen, Ctr Integrated Petroleum Res, Bergen, Norway
[2] Univ Bergen, Dept Clin Med, Bergen, Norway
[3] Univ Bergen, Proteom Unit, Bergen, Norway
[4] Univ Bergen, Dept Informat, N-5008 Bergen, Norway
[5] Univ Bergen, Inst Med, Bergen, Norway
[6] Univ Bergen, Inst Mol Biol, Bergen, Norway
[7] Univ Bergen, Dept Chem, Bergen, Norway
[8] Pattern Recognit Syst AS, Bergen, Norway
[9] Haukeland Hosp, Dept Neurol, N-5021 Bergen, Norway
[10] Haukeland Hosp, Natl Competence Ctr Multiple Sclerosis, N-5021 Bergen, Norway
[11] Haukeland Hosp, Lab Clin Biochem, N-5021 Bergen, Norway
[12] Bergen Ctr Computat Sci, Computat Biol Unit, Bergen, Norway
DOI
10.1021/ac070946s
Chinese Library Classification (CLC) number
O65 [Analytical Chemistry]
Discipline classification code
070302; 081704
Abstract
Mass spectral profiles are influenced by several factors that have no relation to compositional differences between samples: baseline effects, shifts in mass-to-charge ratio (m/z) (synchronization/alignment problem), structured noise (heteroscedasticity), and differences in signal intensities (normalization problem). Different procedures for pretreatment of whole mass spectral profiles described by almost 50 000 m/z values are investigated in order to find optimal approaches with respect to revealing the information content in the data. In order to quantitatively assess the impact of different procedures for pretreatment of mass spectral profiles, we use factorial designs with the ratio between intergroup and intragroup (replicate) variance as response. We have examined the influence of smoothing, binning, alignment/synchronization, noise pattern, and normalization on data interpretation. Our analysis shows that the spectral profiles have to be corrected for heteroscedastic noise prior to normalization. An nth root transform, where n is a small, positive integer, is used to create a homoscedastic noise structure without destroying the linear correlation structures describing individual components when using whole mass spectral profiles. The choice of n is decided by a simple graphic procedure using replicate information. A log transform is shown to shift the heteroscedastic noise structure from being dominant in high-intensity regions to producing the largest noise in low-intensity regions. In addition, the log transform has a negative effect on the collinearity in the profiles. Factorial designs reveal strong interactions between several of the pretreatment steps, e.g., noise structure and normalization. This underlines the limited usability of looking at the different pretreatment steps in isolation.
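The nth root transform described above can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `nth_root_transform` and `channel_sd`, the clipping of negative intensities to zero, and the two-replicate toy data are all assumptions made for illustration. The toy replicates are built so that the noise grows with the square root of the intensity, in which case a square root (n = 2) should roughly equalize the replicate standard deviation across channels.

```python
import statistics


def nth_root_transform(profile, n=2):
    """Return the element-wise nth root of an intensity profile.

    Assumption: negative intensities (e.g. after baseline correction)
    are clipped to zero before taking the root.
    """
    if n < 1:
        raise ValueError("n must be a positive integer")
    return [max(x, 0.0) ** (1.0 / n) for x in profile]


def channel_sd(replicates):
    """Sample standard deviation per m/z channel across replicate profiles."""
    return [statistics.stdev(channel) for channel in zip(*replicates)]


# Hypothetical data: two replicates, one low- and one high-intensity channel.
# The deviations (+/-10 around 100, +/-100 around 10000) scale with sqrt(intensity).
replicates = [[90.0, 9900.0], [110.0, 10100.0]]

raw_sd = channel_sd(replicates)
root_sd = channel_sd([nth_root_transform(r, n=2) for r in replicates])

# Ratio of high-channel to low-channel replicate noise, before and after.
raw_ratio = raw_sd[1] / raw_sd[0]
root_ratio = root_sd[1] / root_sd[0]
print(f"raw noise ratio: {raw_ratio:.2f}, after square root: {root_ratio:.2f}")
```

Comparing the replicate standard deviation across intensity levels for several candidate values of n is one plausible numerical counterpart to the graphic procedure mentioned in the abstract; the abstract itself does not specify how that plot is constructed.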
Binning turns out to be able to substitute for smoothing of spectra by, for example, moving average or Savitzky-Golay filtering, while at the same time reducing the data point description of the profiles by one order of magnitude. Thus, if the sampling density is high, binning seems to be an attractive option for data reduction without the risk of losing information that accompanies the integration of profiles into peaks. In the absence of smoothing, binning should be executed prior to alignment. If binning is not performed, the order of pretreatment should be smoothing, alignment, nth root transform, and normalization.
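The ordering of steps can be sketched in a few lines. This is a hedged illustration, not the procedure from the paper: the helper names (`bin_profile`, `normalize_to_unit_sum`, `pretreat`) are hypothetical, binning is assumed to average consecutive points in fixed-width bins, normalization is assumed to scale each profile to unit sum, and alignment is left as a placeholder because the abstract does not describe the algorithm.

```python
def bin_profile(profile, width=10):
    """Average consecutive points in non-overlapping bins of `width` points.

    A width of 10 reduces the number of data points by about one order
    of magnitude. The last bin may contain fewer points.
    """
    if width < 1:
        raise ValueError("width must be a positive integer")
    bins = (profile[i:i + width] for i in range(0, len(profile), width))
    return [sum(b) / len(b) for b in bins]


def normalize_to_unit_sum(profile):
    """Scale a profile so its intensities sum to 1 (assumed normalization)."""
    total = sum(profile)
    if total == 0:
        return list(profile)
    return [x / total for x in profile]


def pretreat(profile, width=10, n=2):
    """Apply binning, then the nth root transform, then normalization."""
    binned = bin_profile(profile, width)
    # Alignment of the binned profile against a reference would go here;
    # it is omitted in this sketch.
    rooted = [max(x, 0.0) ** (1.0 / n) for x in binned]
    # Normalization comes last, after the noise structure has been corrected.
    return normalize_to_unit_sum(rooted)


# Hypothetical 50-point profile reduced to 5 binned, normalized values.
profile = [float(i % 7) for i in range(50)]
treated = pretreat(profile)
print(len(profile), "->", len(treated), "points")
```

Placing the root transform before normalization follows the abstract's finding that heteroscedastic noise has to be corrected prior to normalization.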
Pages: 7014-7026
Number of pages: 13