Sequential Model Selection-Based Segmentation to Detect DNA Copy Number Variation

Cited by: 2

Authors:

Hu, Jianhua [1]

Zhang, Liwen [2]

Wang, Huixia Judy [3]

Affiliations:

[1] Univ Texas MD Anderson Canc Ctr, Dept Biostat, Houston, TX 77030 USA

[2] Shanghai Univ, Sch Econ, Shanghai 200444, Peoples R China

[3] George Washington Univ, Dept Stat, Washington, DC 20052 USA

Source:

BIOMETRICS | 2016 / Vol. 72 / No. 03

Funding:

National Science Foundation (USA); National Institutes of Health (USA);

Keywords:

Array-based CGH; Bayesian information criterion; Copy-number variation; Segmentation; Sequential model selection; RESOLUTION GENOMIC PROFILES; ARRAY CGH DATA; IDENTIFICATION; ABERRATIONS; DISCOVERY; FRAMEWORK; REGIONS; GENES;

DOI:

10.1111/biom.12478

Chinese Library Classification (CLC) number:

Q [Biological Sciences];

Discipline classification codes:

07 ; 0710 ; 09 ;

Abstract:

Array-based CGH experiments are designed to detect genomic aberrations or regions of DNA copy-number variation that are associated with an outcome, typically a disease state. Most existing statistical methods target the detection of DNA copy-number variations in a single sample or array. We focus on the detection of group effect variation through the simultaneous study of multiple samples from multiple groups. Rather than using direct segmentation or smoothing techniques, as is common in existing detection methods, we develop a sequential model selection procedure that is guided by a modified Bayesian information criterion. This approach improves detection accuracy by cumulatively using information across contiguous clones, and it has a computational advantage over popular existing detection methods. Our empirical investigation suggests that the proposed method outperforms existing detection methods, particularly in detecting small segments and in separating neighboring segments with different degrees of copy-number variation.

Citation

Pages: 815-826

Number of pages: 12

