Long-read-based human genomic structural variation detection with cuteSV

Cited by: 187
Authors
Jiang, Tao [1]
Liu, Yongzhuang [1]
Jiang, Yue [2]
Li, Junyi [3]
Gao, Yan [1]
Cui, Zhe [1]
Liu, Yadong [1]
Liu, Bo [1]
Wang, Yadong [1]
Affiliations
[1] Harbin Inst Technol, Sch Comp Sci & Technol, Ctr Bioinformat, Harbin 150001, Heilongjiang, Peoples R China
[2] Nebula Genom, Harbin 150030, Heilongjiang, Peoples R China
[3] Harbin Inst Technol Shenzhen, Sch Comp Sci & Technol, Shenzhen 518055, Guangdong, Peoples R China
Keywords
Structural variants detection; Long-read sequencing; Scaling performance; PAIRED-END; IMPACT; DISCOVERY; INSERTION; VARIANTS; SEQUENCE
DOI
10.1186/s13059-020-02107-y
Chinese Library Classification (CLC) number
Q81 [Bioengineering (Biotechnology)]; Q93 [Microbiology]
Discipline classification codes
071005; 0836; 090102; 100705
Abstract
Long-read sequencing is promising for the comprehensive discovery of structural variations (SVs). However, it is still non-trivial to achieve high yields and performance simultaneously due to the complex SV signatures implied by noisy long reads. We propose cuteSV, a sensitive, fast, and scalable long-read-based SV detection approach. cuteSV uses tailored methods to collect the signatures of various types of SVs and employs a clustering-and-refinement method to implement sensitive SV detection. Benchmarks on simulated and real long-read sequencing datasets demonstrate that cuteSV has higher yields and scaling performance than state-of-the-art tools. cuteSV is available at https://github.com/tjiangHIT/cuteSV.
Pages: 24