DELLY: structural variant discovery by integrated paired-end and split-read analysis

Cited by: 1498
Authors
Rausch, Tobias [1]
Zichner, Thomas
Schlattl, Andreas
Stuetz, Adrian M.
Benes, Vladimir [1]
Korbel, Jan O. [2]
Affiliations
[1] European Mol Biol Lab, Core Facil & Serv, D-69117 Heidelberg, Germany
[2] EMBL European Bioinformat Inst, Cambridge CB10 1SD, England
Keywords
COPY NUMBER VARIATION; CANCER GENOME; RESOLUTION; REARRANGEMENTS; POLYMORPHISM; INSERTIONS; NUCLEOTIDE; EXPRESSION
DOI
10.1093/bioinformatics/bts378
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Subject classification codes
071010; 081704
Abstract
Motivation: The discovery of genomic structural variants (SVs) at high sensitivity and specificity is an essential requirement for characterizing naturally occurring variation and for understanding pathological somatic rearrangements in personal genome sequencing data. Of particular interest are integrated methods that accurately identify simple and complex rearrangements in heterogeneous sequencing datasets at single-nucleotide resolution, as an optimal basis for investigating the formation mechanisms and functional consequences of SVs.
Results: We have developed an SV discovery method, called DELLY, that integrates short-insert paired-ends, long-range mate-pairs and split-read alignments to accurately delineate genomic rearrangements at single-nucleotide resolution. DELLY is suitable for detecting copy-number variable deletion and tandem duplication events as well as balanced rearrangements such as inversions or reciprocal translocations. DELLY thus makes it possible to ascertain the full spectrum of genomic rearrangements, including complex events. On simulated data, DELLY compares favorably to other SV prediction methods across a wide range of sequencing parameters. On real data, DELLY reliably uncovers SVs from the 1000 Genomes Project and cancer genomes, and validation experiments of randomly selected deletion loci show a high specificity.
Pages: I333-I339
Page count: 7