Comprehensive evaluation of structural variant genotyping methods based on long-read sequencing data

Cited by: 12
Authors
Duan, Xiaoke [1, 2]
Pan, Mingpei [1, 2]
Fan, Shaohua [1]
Affiliations
[1] Fudan Univ, Zhangjiang Fudan Int Innovat Ctr, Human Phenome Inst, State Key Lab Genet Engn, Shanghai 200438, Peoples R China
[2] Fudan Univ, Sch Life Sci, Dept Anthropol & Human Genet, MOE Key Lab Contemporary Anthropol, Shanghai 200433, Peoples R China
Funding
National Natural Science Foundation of China; National Key Research and Development Program of China
Keywords
Long-read sequencing; SV genotyping; F1 score; Performance evaluation; EVOLUTION; SELECTION; MUTATION; IMPACT
DOI
10.1186/s12864-022-08548-y
Chinese Library Classification (CLC) number
Q81 [Bioengineering (Biotechnology)]; Q93 [Microbiology]
Discipline classification codes
071005; 0836; 090102; 100705
Abstract
Background: Structural variants (SVs) play a crucial role in gene regulation, trait association, and disease in humans. SV genotyping has been extensively applied in genomics research and clinical diagnosis. Although a growing number of SV genotyping methods for long reads have been developed, a comprehensive performance assessment of these methods has yet to be done.
Results: Based on one simulated and three real SV datasets, we performed an in-depth evaluation of five SV genotyping methods: cuteSV, LRcaller, Sniffles, SVJedi, and VaPoR. For insertions and deletions, cuteSV and LRcaller have similar F1 scores (cuteSV, insertions: 0.69-0.90, deletions: 0.77-0.90; LRcaller, insertions: 0.67-0.87, deletions: 0.74-0.91) and outperform the other methods. For duplications, inversions, and translocations, LRcaller yields the most accurate genotyping results (F1 scores of 0.84, 0.68, and 0.47, respectively). For SVs located in tandem repeat regions or with imprecise breakpoints, cuteSV (insertions and deletions) and LRcaller (duplications, inversions, and translocations) perform better than the other methods. In addition, F1 scores decreased as SV size increased. Finally, our analyses suggest that the F1 scores of these methods reach the point of diminishing returns at 20x depth of coverage.
Conclusions: We present an in-depth benchmark study of long-read SV genotyping methods. Our results highlight the advantages and disadvantages of each genotyping method, providing practical guidance for selecting a tool for a given application and indicating directions for tool improvement.
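The abstract reports every result as an F1 score, so a minimal sketch of that metric may help when reading the numbers. The F1 formula below is the standard harmonic mean of precision and recall. The counting convention is an assumption made for illustration, not taken from the paper: a call whose genotype matches the truth set is a true positive, a call with a different genotype counts as both a false positive and a false negative, and a truth SV with no call is a false negative. The names `genotype_counts`, `f1_score`, and the example SV identifiers are hypothetical.

```python
from typing import Dict, Tuple


def genotype_counts(truth: Dict[str, str], calls: Dict[str, str]) -> Tuple[int, int, int]:
    """Return (TP, FP, FN) under the assumed convention described above."""
    tp = fp = fn = 0
    for sv_id, true_gt in truth.items():
        called_gt = calls.get(sv_id)
        if called_gt is None:
            fn += 1          # truth SV was never genotyped
        elif called_gt == true_gt:
            tp += 1          # genotype agrees with the truth set
        else:
            fp += 1          # a wrong genotype was reported...
            fn += 1          # ...and the correct one was missed
    # Calls for SVs absent from the truth set are counted as false positives.
    fp += sum(1 for sv_id in calls if sv_id not in truth)
    return tp, fp, fn


def f1_score(tp: int, fp: int, fn: int) -> float:
    """Harmonic mean of precision and recall; 0.0 when either is undefined."""
    if tp + fp == 0 or tp + fn == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy data: four truth SVs, three genotype calls.
truth = {"sv1": "0/1", "sv2": "1/1", "sv3": "0/1", "sv4": "1/1"}
calls = {"sv1": "0/1", "sv2": "0/1", "sv3": "0/1"}

tp, fp, fn = genotype_counts(truth, calls)
score = f1_score(tp, fp, fn)
print(f"TP={tp} FP={fp} FN={fn} F1={score:.2f}")
```

In this toy case two of three calls are correct (precision 2/3) and two of four truth genotypes are recovered (recall 1/2), giving an F1 of 4/7. Benchmarks can differ in how they match SVs and treat wrong genotypes, so the paper's own definitions take precedence over this sketch.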
Pages: 14