Copy number variation detection in whole-genome sequencing data using the Bayesian information criterion

Cited by: 169
Authors
Xi, Ruibin [1]
Hadjipanayis, Angela G. [2]
Luquette, Lovelace J. [1]
Kim, Tae-Min [1]
Lee, Eunjung [1,2]
Zhang, Jianhua [3]
Johnson, Mark D. [4]
Muzny, Donna M. [5]
Wheeler, David A. [5]
Gibbs, Richard A. [5]
Kucherlapati, Raju [2,6]
Park, Peter J. [1,2,7]
Affiliations
[1] Harvard Univ, Sch Med, Ctr Biomed Informat, Boston, MA 02115 USA
[2] Brigham & Womens Hosp, Div Genet, Boston, MA 02115 USA
[3] Dana Farber Canc Inst, Belfer Inst Appl Canc Sci, Boston, MA 02115 USA
[4] Brigham & Womens Hosp, Dept Neurol Surg, Boston, MA 02115 USA
[5] Baylor Coll Med, Human Genome Sequencing Ctr, Houston, TX 77030 USA
[6] Harvard Univ, Sch Med, Dept Genet, Boston, MA 02115 USA
[7] Childrens Hosp, Informat Program, Boston, MA 02115 USA
Funding
National Institutes of Health (USA)
Keywords
structural variation; genomic alterations; model selection; semiparametric model; rare chromosomal deletions; myeloid-leukemia genome; tumor-suppressor gene; population-scale; cancer; glioblastoma; array; schizophrenia; algorithms
DOI
10.1073/pnas.1110574108
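Chinese Library Classification (CLC)
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences]
Subject classification codes
07; 0710; 09
Abstract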
DNA copy number variations (CNVs) play an important role in the pathogenesis and progression of cancer and confer susceptibility to a variety of human disorders. Array comparative genomic hybridization has been used widely to identify CNVs genome wide, but the next-generation sequencing technology provides an opportunity to characterize CNVs genome wide with unprecedented resolution. In this study, we developed an algorithm to detect CNVs from whole-genome sequencing data and applied it to a newly sequenced glioblastoma genome with a matched control. This read-depth algorithm, called BIC-seq, can accurately and efficiently identify CNVs via minimizing the Bayesian information criterion. Using BIC-seq, we identified hundreds of CNVs as small as 40 bp in the cancer genome sequenced at 10x coverage, whereas we could only detect large CNVs (>15 kb) in the array comparative genomic hybridization profiles for the same genome. Eighty percent (14/16) of the small variants tested (110 bp to 14 kb) were experimentally validated by quantitative PCR, demonstrating high sensitivity and true positive rate of the algorithm. We also extended the algorithm to detect recurrent CNVs in multiple samples as well as deriving error bars for breakpoints using a Gibbs sampling approach. We propose this statistical approach as a principled yet practical and efficient method to estimate CNVs in whole-genome sequencing data.
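The abstract's core idea, segmenting read-depth data by minimizing the Bayesian information criterion, can be illustrated with a minimal sketch. It is not the BIC-seq implementation. It assumes the genome has been divided into bins holding tumor and control read counts, models each segment with a single Bernoulli probability that a read comes from the tumor, and greedily merges adjacent segments while the BIC decreases. The function names, the bin representation, and the penalty weight `lam` are hypothetical choices made for illustration.

```python
import math


def seg_loglik(tumor, normal):
    """Bernoulli log-likelihood of a segment, evaluated at its MLE tumor fraction."""
    n = tumor + normal
    ll = 0.0
    if tumor > 0:
        ll += tumor * math.log(tumor / n)
    if normal > 0:
        ll += normal * math.log(normal / n)
    return ll


def segment_by_bic(bins, lam=1.0):
    """Greedy bottom-up segmentation (illustrative sketch only).

    bins: list of (tumor_reads, normal_reads) per bin, in genomic order.
    lam:  hypothetical weight on the BIC penalty term.
    Returns a list of [start_bin, end_bin_exclusive, tumor_reads, normal_reads].
    """
    segs = [[i, i + 1, t, c] for i, (t, c) in enumerate(bins)]
    total = sum(t + c for t, c in bins)
    if total == 0:
        return segs
    # Merging two segments removes one parameter from the model.
    penalty = lam * math.log(total)
    while len(segs) > 1:
        best_delta, best_i = 0.0, None
        for i in range(len(segs) - 1):
            a, b = segs[i], segs[i + 1]
            merged = seg_loglik(a[2] + b[2], a[3] + b[3])
            separate = seg_loglik(a[2], a[3]) + seg_loglik(b[2], b[3])
            # Change in BIC if this adjacent pair were merged.
            delta = -2.0 * (merged - separate) - penalty
            if delta < best_delta:
                best_delta, best_i = delta, i
        if best_i is None:
            break  # no merge lowers the BIC any further
        a, b = segs[best_i], segs[best_i + 1]
        segs[best_i:best_i + 2] = [[a[0], b[1], a[2] + b[2], a[3] + b[3]]]
    return segs
```

Called on bins whose tumor-to-control ratio shifts over a stretch of the genome, `segment_by_bic` returns segments whose boundaries are candidate copy number breakpoints. A larger `lam` yields fewer, longer segments.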
Pages: E1128-E1136
Page count: 9