Bayesian Hidden Markov Modeling of Array CGH Data

Cited by: 53
Authors
Guha, Subharup [1]
Li, Yi [2]
Neuberg, Donna [2]
Affiliations
[1] Univ Missouri, Dept Stat, Columbia, MO 65211 USA
[2] Harvard Univ, Sch Publ Hlth, Dept Biostat, Boston, MA 02115 USA
Keywords
amplifications; cancer; copy number; deletions; DNA; genomic alterations; intensity ratios; MCMC; tumor;
DOI
10.1198/016214507000000923
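Chinese Library Classification number
O21 [Probability theory and mathematical statistics]; C8 [Statistics]
Subject classification code
020208; 070103; 0714
Abstract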
Genomic alterations have been linked to the development and progression of cancer. The technique of comparative genomic hybridization (CGH) yields data consisting of fluorescence intensity ratios of test and reference DNA samples. The intensity ratios provide information about the number of copies in DNA. Practical issues such as the contamination of tumor cells in tissue specimens and normalization errors necessitate the use of statistics for learning about the genomic alterations from array CGH data. As increasing amounts of array CGH data become available, there is a growing need for automated algorithms for characterizing genomic profiles. Specifically, there is a need for algorithms that can identify gains and losses in the number of copies based on statistical considerations, rather than merely detect trends in the data. We adopt a Bayesian approach, relying on the hidden Markov model to account for the inherent dependence in the intensity ratios. Posterior inferences are made about gains and losses in copy number. Localized amplifications (associated with oncogene mutations) and deletions (associated with mutations of tumor suppressors) are identified using posterior probabilities. Global trends such as extended regions of altered copy number are detected. Because the posterior distribution is analytically intractable, we implement a Metropolis-within-Gibbs algorithm for efficient simulation-based inference. Publicly available data on pancreatic adenocarcinoma, glioblastoma multiforme, and breast cancer are analyzed, and comparisons are made with some widely used algorithms to illustrate the reliability and success of the technique.
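The role of the hidden Markov model in the abstract can be illustrated with a minimal sketch. It is not the authors' Metropolis-within-Gibbs sampler: the parameters below are fixed, hypothetical values rather than draws from a posterior, and the three-state layout (loss, neutral, gain), the Gaussian emissions, the function name `posterior_state_probs`, and the toy log-ratios are all assumptions made for illustration. The sketch only shows how neighboring probes share information when state probabilities are computed by forward-backward smoothing.

```python
import numpy as np

def posterior_state_probs(y, means, sd, trans, init):
    """Forward-backward smoothing for a Gaussian-emission HMM.

    Returns an array of shape (len(y), n_states) whose row t holds
    P(state_t = k | all observations) under fixed, illustrative parameters.
    """
    y = np.asarray(y, dtype=float)
    means = np.asarray(means, dtype=float)
    T, K = len(y), len(means)
    # Emission likelihoods, up to a constant shared by all states.
    lik = np.exp(-0.5 * ((y[:, None] - means[None, :]) / sd) ** 2)

    alpha = np.zeros((T, K))
    beta = np.ones((T, K))
    # Forward pass, rescaled at each step to avoid numerical underflow.
    alpha[0] = init * lik[0]
    alpha[0] /= alpha[0].sum()
    for t in range(1, T):
        alpha[t] = (alpha[t - 1] @ trans) * lik[t]
        alpha[t] /= alpha[t].sum()
    # Backward pass, rescaled in the same way.
    for t in range(T - 2, -1, -1):
        beta[t] = trans @ (lik[t + 1] * beta[t + 1])
        beta[t] /= beta[t].sum()
    post = alpha * beta
    return post / post.sum(axis=1, keepdims=True)

# Hypothetical setup: state 0 = loss, 1 = neutral, 2 = gain.
means = [-0.5, 0.0, 0.5]          # assumed mean log-ratio per state
sd = 0.2                          # assumed common noise level
trans = np.array([[0.90, 0.05, 0.05],
                  [0.05, 0.90, 0.05],
                  [0.05, 0.05, 0.90]])  # "sticky" transitions
init = np.full(3, 1.0 / 3.0)

# Toy log-intensity ratios along a chromosome (made-up numbers).
log_ratios = [0.02, -0.05, 0.01, 0.55, 0.48, 0.60, 0.03, -0.02, -0.52, -0.45]
probs = posterior_state_probs(log_ratios, means, sd, trans, init)
calls = probs.argmax(axis=1)      # most probable state per probe
print(np.round(probs, 3))
print(calls)
```

In a fully Bayesian treatment such as the one the abstract describes, the means, variances, and transition probabilities would themselves be unknown and sampled by MCMC, with state probabilities averaged over those samples instead of being computed at one fixed parameter setting.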
Pages: 485-497
Number of pages: 13