A geometric approach for classification and comparison of structural variants

Cited by: 117
Authors
Sindi, Suzanne [1, 2]
Helman, Elena [3]
Bashir, Ali [4]
Raphael, Benjamin J. [1, 3]
Affiliations
[1] Brown Univ, Ctr Computat Mol Biol, Providence, RI 02912 USA
[2] Brown Univ, Div Appl Math, Providence, RI 02912 USA
[3] Univ Calif San Diego, Dept Comp Sci, San Diego, CA 92103 USA
[4] Univ Calif San Diego, Bioinformat Grad Program, San Diego, CA 92103 USA
Keywords
COPY-NUMBER POLYMORPHISM; COMPARATIVE GENOMIC HYBRIDIZATION; FINE-SCALE; ARRAY CGH; REARRANGEMENTS; ARCHITECTURE;
DOI
10.1093/bioinformatics/btp208
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification code
071010; 081704
Abstract
Motivation: Structural variants, including duplications, insertions, deletions and inversions of large blocks of DNA sequence, are an important contributor to human genome variation. Measuring structural variants in a genome sequence is typically more challenging than measuring single nucleotide changes. Current approaches for structural variant identification, including paired-end DNA sequencing/mapping and array comparative genomic hybridization (aCGH), do not identify the boundaries of variants precisely. Consequently, most reported human structural variants are poorly defined and not readily compared across different studies and measurement techniques. Results: We introduce Geometric Analysis of Structural Variants (GASV), a geometric approach for identification, classification and comparison of structural variants. This approach represents the uncertainty in measurement of a structural variant as a polygon in the plane, and identifies measurements supporting the same variant by computing intersections of polygons. We derive a computational geometry algorithm to efficiently identify all such intersections. We apply GASV to sequencing data from nine individual human genomes and several cancer genomes. We obtain better localization of the boundaries of structural variants, distinguish genetic from putative somatic structural variants in cancer genomes, and integrate aCGH and paired-end sequencing measurements of structural variants. This work presents the first general framework for comparing structural variants across multiple samples and measurement techniques, and will be useful for studies of both genetic structural variants and somatic rearrangements in cancer.
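The polygon idea in the abstract can be illustrated with a small sketch. It is not the GASV implementation, and the constraint form is an assumption that the abstract does not spell out: for a deletion supported by a paired read whose ends map to x < y, the breakpoints (a, b) are taken to satisfy a >= x, b <= y and l_min <= (a - x) + (y - b) <= l_max, where l_min and l_max bound the fragment length. All names here (Region, region_from_pair, intersect) are hypothetical. Because every region of this form is bounded by the same three line directions, two regions can be intersected by combining bounds; the sketch does not attempt the general sweep over many polygons that the paper describes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    """Breakpoint region {(a, b)}: a >= a_min, b <= b_max, d_min <= a - b <= d_max."""
    a_min: float
    b_max: float
    d_min: float
    d_max: float

    def is_empty(self) -> bool:
        # With a >= a_min and b <= b_max, a - b can take any value >= a_min - b_max,
        # so the region is non-empty iff [d_min, d_max] reaches that range.
        return self.d_min > self.d_max or self.d_max < self.a_min - self.b_max


def region_from_pair(x: float, y: float, l_min: float, l_max: float) -> Region:
    """Region for one paired read (assumed deletion model, see lead-in)."""
    # l_min <= (a - x) + (y - b) <= l_max  rewritten as bounds on a - b.
    return Region(a_min=x, b_max=y, d_min=l_min + x - y, d_max=l_max + x - y)


def intersect(r: Region, s: Region) -> Region:
    """Intersection of two regions; the result may be empty."""
    return Region(
        a_min=max(r.a_min, s.a_min),
        b_max=min(r.b_max, s.b_max),
        d_min=max(r.d_min, s.d_min),
        d_max=min(r.d_max, s.d_max),
    )


# Illustrative coordinates only, not data from the paper.
p1 = region_from_pair(1000, 5000, 200, 400)
p2 = region_from_pair(1050, 5100, 200, 400)  # nearby pair, same deletion
p3 = region_from_pair(1000, 9000, 200, 400)  # much longer deletion
p4 = region_from_pair(4900, 8900, 200, 400)  # same length, different location

print(intersect(p1, p2).is_empty())  # False: both can support one variant
print(intersect(p1, p3).is_empty())  # True: implied deletion sizes disagree
print(intersect(p1, p4).is_empty())  # True: implied locations disagree
```

A non-empty intersection is itself a Region, so it narrows the possible breakpoint locations as more supporting measurements are intersected, which is the localization effect the abstract refers to.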
Pages: i222-i230
Page count: 9