A backward procedure for change-point detection with applications to copy number variation detection

Cited by: 7
Authors
Shin, Seung Jun [1]
Wu, Yichao [2]
Hao, Ning [3]
Affiliations
[1] Korea Univ, Dept Stat, Seoul, South Korea
[2] Univ Illinois, Dept Math Stat & Comp Sci, Chicago, IL 60680 USA
[3] Univ Arizona, Dept Math, Tucson, AZ 85721 USA
Source
CANADIAN JOURNAL OF STATISTICS-REVUE CANADIENNE DE STATISTIQUE | 2020 / Vol. 48 / No. 03
Funding
National Science Foundation (USA); National Research Foundation, Singapore
Keywords
Backward detection; copy number variation; mean change-point model; multiple change points; short signal; CIRCULAR BINARY SEGMENTATION; STRUCTURAL VARIATION; ALGORITHM; IDENTIFICATION; ASSOCIATION; GENOTYPE; COMMON; TESTS
DOI
10.1002/cjs.11535
Chinese Library Classification
O21 [Probability theory and mathematical statistics]; C8 [Statistics]
Subject classification codes
020208; 070103; 0714
Abstract
Change-point detection has recently regained much attention for analyzing array or sequencing data in copy number variation (CNV) detection. In such applications, the true signals are typically very short and buried in a long data sequence, which makes it challenging to identify the variations efficiently and accurately. In this article, we propose a new change-point detection method, a backward procedure, which is not only fast and simple enough to handle high-dimensional data but also performs very well in detecting short signals. Although motivated by CNV detection, the backward procedure is generally applicable to assorted change-point problems that arise in a variety of scientific applications. Both simulated and real CNV data illustrate that backward detection has clear advantages over competing methods, especially when the true signal is short. The Canadian Journal of Statistics; 2020 © 2020 Statistical Society of Canada
Pages: 366-385
Number of pages: 20