Using chemometrics for navigating in the large data sets of genomics, proteomics, and metabonomics (gpm)

Cited by: 205
Authors
Eriksson, L
Antti, H
Gottfries, J
Holmes, E
Johansson, E
Lindgren, F
Long, I
Lundstedt, T
Trygg, J
Wold, S
Affiliations
[1] Umetr AB, S-90719 Umea, Sweden
[2] Univ London Imperial Coll Sci Technol & Med, Fac Med, Div Biomed Sci, London SW7 2AZ, England
[3] AstraZeneca, R&D Molndal, S-43183 Molndal, Sweden
[4] Umea Univ, Inst Chem, S-90187 Umea, Sweden
[5] Umetr AB, Malmo Off, S-21134 Malmo, Sweden
[6] Uppsala Univ, Dept Pharmaceut Chem, S-74123 Uppsala, Sweden
Keywords
PCA; PLS; hierarchical modeling; multivariate analysis; omics data analysis
DOI
10.1007/s00216-004-2783-y
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Subject classification code
071010; 081704
Abstract
This article describes the applicability of multivariate projection techniques, such as principal-component analysis (PCA) and partial least-squares (PLS) projections to latent structures, to the large-volume high-density data structures obtained within genomics, proteomics, and metabonomics. PCA and PLS, and their extensions, derive their usefulness from their ability to analyze data with many, noisy, collinear, and even incomplete variables in both X and Y. Three examples are used as illustrations: the first example is a genomics data set and involves modeling of microarray data of cell cycle-regulated genes in the microorganism Saccharomyces cerevisiae. The second example contains NMR-metabonomics data, measured on urine samples of male rats treated with either of the drugs chloroquine or amiodarone. The third and last data set describes sequence-function classification studies in a set of G-protein-coupled receptors using hierarchical PCA.
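The abstract's claim that projection methods cope with collinear and incomplete variables can be illustrated with a small sketch. The code below computes a first principal component with a NIPALS-style iteration that skips missing cells. It is an illustration under stated assumptions, not the article's implementation: the function name `nipals_pc1`, the toy matrix, and the use of `None` for missing values are all hypothetical choices made here.

```python
import math

def nipals_pc1(X, max_iter=500, tol=1e-12):
    """First principal component of X (list of rows); None marks a missing cell.

    Hypothetical helper for illustration. Returns (scores, loadings).
    """
    n, k = len(X), len(X[0])
    # Center each column using only its observed values.
    means = []
    for j in range(k):
        obs = [row[j] for row in X if row[j] is not None]
        means.append(sum(obs) / len(obs))
    Xc = [[None if v is None else v - means[j] for j, v in enumerate(row)]
          for row in X]

    # Initial scores: the first centered column, with missing cells set to 0.
    t = [row[0] if row[0] is not None else 0.0 for row in Xc]
    p = [0.0] * k
    for _ in range(max_iter):
        # Loadings: regress each column on the scores over observed rows.
        p = []
        for j in range(k):
            num = sum(Xc[i][j] * t[i] for i in range(n) if Xc[i][j] is not None)
            den = sum(t[i] ** 2 for i in range(n) if Xc[i][j] is not None)
            p.append(num / den if den else 0.0)
        norm = math.sqrt(sum(v * v for v in p))
        if norm == 0.0:
            raise ValueError("degenerate start vector; no variance found")
        p = [v / norm for v in p]
        # Scores: regress each row on the loadings over observed columns.
        t_new = []
        for i in range(n):
            num = sum(Xc[i][j] * p[j] for j in range(k) if Xc[i][j] is not None)
            den = sum(p[j] ** 2 for j in range(k) if Xc[i][j] is not None)
            t_new.append(num / den if den else 0.0)
        change = sum((a - b) ** 2 for a, b in zip(t_new, t))
        t = t_new
        if change <= tol * sum(v * v for v in t):
            break
    return t, p

# Toy data (made up): three nearly collinear variables, one missing cell.
X = [
    [1.0, 2.1, -0.9],
    [2.0, 3.9, -2.1],
    [3.0, None, -3.0],
    [4.0, 8.1, -4.1],
    [5.0, 9.9, -4.9],
]
scores, loadings = nipals_pc1(X)
```

In this toy case a single component summarizes all three variables: the first two loadings share a sign and the third is opposite, mirroring the way the columns were constructed, and the row with the missing cell still receives a score.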
Pages: 419-429
Number of pages: 11
Related papers
38 items in total
  • [1] Batch statistical processing of 1H NMR-derived urinary spectral data
    Antti, H
    Bollard, ME
    Ebbels, T
    Keun, H
    Lindon, JC
    Nicholson, JK
    Holmes, E
    [J]. JOURNAL OF CHEMOMETRICS, 2002, 16 (8-10) : 461 - 468
  • [2] Partial least squares for discrimination
    Barker, M
    Rayens, W
    [J]. JOURNAL OF CHEMOMETRICS, 2003, 17 (03) : 166 - 173
  • [3] Berglund A, 1997, J CHEMOMETR, V11, P141, DOI 10.1002/(SICI)1099-128X(199703)11:2<141::AID-CEM461>3.0.CO;2-2
  • [5] Alignment of flexible molecules at their receptor site using 3D descriptors and Hi-PCA
    Berglund, A
    De Rosa, MC
    Wold, S
    [J]. JOURNAL OF COMPUTER-AIDED MOLECULAR DESIGN, 1997, 11 (06) : 601 - 612
  • [6] Burnham AJ, 1996, J CHEMOMETR, V10, P31, DOI 10.1002/(SICI)1099-128X(199601)10:1<31::AID-CEM398>3.0.CO;2-1
  • [8] Latent variable multivariate regression modeling
    Burnham, AJ
    MacGregor, JF
    Viveros, R
    [J]. CHEMOMETRICS AND INTELLIGENT LABORATORY SYSTEMS, 1999, 48 (02) : 167 - 180
  • [9] A genome-wide transcriptional analysis of the mitotic cell cycle
    Cho, RJ
    Campbell, MJ
    Winzeler, EA
    Steinmetz, L
    Conway, A
    Wodicka, L
    Wolfsberg, TG
    Gabrielian, AE
    Landsman, D
    Lockhart, DJ
    Davis, RW
    [J]. MOLECULAR CELL, 1998, 2 (01) : 65 - 73
  • [10] Eriksson L, 2000, QUANT STRUCT-ACT REL, V19, P345, DOI 10.1002/1521-3838(200010)19:4<345::AID-QSAR345>3.0.CO