BreaKmer: detection of structural variation in targeted massively parallel sequencing data using kmers

Cited by: 142
Authors
Abo, Ryan P. [1, 2, 3]
Ducar, Matthew [1, 2, 3, 4]
Garcia, Elizabeth P.
Thorner, Aaron R. [1, 2, 3]
Rojas-Rudilla, Vanesa [4]
Lin, Ling [1, 2, 3]
Sholl, Lynette M. [4]
Hahn, William C. [1, 2, 3, 5, 6]
Meyerson, Matthew [1, 2, 3, 4, 5, 6]
Lindeman, Neal I. [4]
Van Hummelen, Paul [1, 2, 3]
MacConaill, Laura E. [1, 2, 3, 4]
Affiliations
[1] Dana Farber Canc Inst, Ctr Canc Genome Discovery, Boston, MA 02215 USA
[2] Dana Farber Canc Inst, Dept Med Oncol, Boston, MA 02215 USA
[3] Harvard Univ, Sch Med, Boston, MA 02215 USA
[4] Brigham & Womens Hosp, Dept Pathol, Boston, MA 02215 USA
[5] Broad Inst Harvard, Cambridge, MA 02141 USA
[6] MIT, Cambridge, MA 02141 USA
Keywords
ACUTE MYELOID-LEUKEMIA; READ ALIGNMENT; CANCER GENOMES; TRANSLOCATIONS; GENE; IDENTIFICATION; LANDSCAPES; RESOLUTION;
DOI
10.1093/nar/gku1211
Chinese Library Classification
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification codes
071010; 081704
Abstract
Genomic structural variation (SV), a common hallmark of cancer, has important predictive and therapeutic implications. However, accurately detecting SV using high-throughput sequencing data remains challenging, especially for 'targeted' resequencing efforts. This is critically important in the clinical setting, where targeted resequencing is frequently applied to rapidly assess clinically actionable mutations in tumor biopsies in a cost-effective manner. We present BreaKmer, a novel approach that uses a 'kmer' strategy to assemble misaligned sequence reads for predicting insertions, deletions, inversions, tandem duplications and translocations at base-pair resolution in targeted resequencing data. Variants are predicted by realigning an assembled consensus sequence created from sequence reads that were abnormally aligned to the reference genome. Using targeted resequencing data from tumor specimens with orthogonally validated SV, non-tumor samples and whole-genome sequencing data, BreaKmer had a 97.4% overall sensitivity for known events and predicted 17 positively validated, novel variants. Relative to four publicly available algorithms, BreaKmer detected SV with increased sensitivity and limited calls in non-tumor samples, key features for variant analysis of tumor specimens in both the clinical and research settings.
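As an illustration of what a 'kmer' strategy can look like, the following is a minimal sketch, not BreaKmer's implementation. It assumes that k-mers found in abnormally aligned reads but absent from the reference target sequence mark candidate breakpoint sequence; the abstract does not state this step, and the function names `kmers` and `sample_only_kmers`, the value of `k`, and the toy sequences are all hypothetical.

```python
from collections import Counter


def kmers(seq, k):
    """Yield every overlapping substring of length k in seq."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def sample_only_kmers(reads, reference, k):
    """Count k-mers that occur in the reads but not in the reference.

    Assumption for illustration: such k-mers span a junction that the
    reference does not contain, so they are candidates for assembly.
    """
    reference_kmers = set(kmers(reference, k))
    counts = Counter()
    for read in reads:
        for kmer in kmers(read, k):
            if kmer not in reference_kmers:
                counts[kmer] += 1
    return counts


# Toy data: the first two reads span a deletion of "CCCGGG" from the
# reference; the third read matches the reference exactly.
reference = "AAACCCGGGTTT"
reads = ["AAATTT", "AAATT", "AACCCGG"]
counts = sample_only_kmers(reads, reference, 4)
print(sorted(counts.items()))  # [('AAAT', 2), ('AATT', 2), ('ATTT', 1)]
```

In this toy case only the reads that cross the deletion junction contribute k-mers, which is the property a downstream assembly and realignment step would rely on.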
Pages: 13