Sequential Model Selection-Based Segmentation to Detect DNA Copy Number Variation

Cited by: 2
Authors
Hu, Jianhua [1]
Zhang, Liwen [2]
Wang, Huixia Judy [3]
Affiliations
[1] Univ Texas MD Anderson Canc Ctr, Dept Biostat, Houston, TX 77030 USA
[2] Shanghai Univ, Sch Econ, Shanghai 200444, Peoples R China
[3] George Washington Univ, Dept Stat, Washington, DC 20052 USA
Funding
National Science Foundation (USA); National Institutes of Health (USA);
Keywords
Array-based CGH; Bayesian information criterion; Copy-number variation; Segmentation; Sequential model selection; RESOLUTION GENOMIC PROFILES; ARRAY CGH DATA; IDENTIFICATION; ABERRATIONS; DISCOVERY; FRAMEWORK; REGIONS; GENES;
DOI
10.1111/biom.12478
Chinese Library Classification number
Q [Biological Sciences];
Subject classification code
07 ; 0710 ; 09 ;
Abstract
Array-based CGH experiments are designed to detect genomic aberrations or regions of DNA copy-number variation that are associated with an outcome, typically a disease state. Most existing statistical methods target the detection of DNA copy-number variation in a single sample or array. We focus on the detection of group-effect variation through the simultaneous study of multiple samples from multiple groups. Rather than using direct segmentation or smoothing techniques, as is common in existing detection methods, we develop a sequential model selection procedure that is guided by a modified Bayesian information criterion. This approach improves detection accuracy by cumulatively utilizing information across contiguous clones, and has a computational advantage over existing popular detection methods. Our empirical investigation suggests that the proposed method outperforms existing detection methods, in particular in detecting small segments and in separating neighboring segments with different degrees of copy-number variation.
Pages: 815-826
Page count: 12