Whole Animal Genome Sequencing: user-friendly, rapid, containerized pipelines for processing, variant discovery, and annotation of short-read whole genome sequencing data

Times cited: 11
Authors
Cullen, Jonah N. [1]
Friedenberg, Steven G. [1]
Affiliations
[1] Univ Minnesota, Coll Vet Med, Dept Vet Clin Sci, 1352 Boyd Ave, St Paul, MN 55108 USA
Source
G3-GENES GENOMES GENETICS | 2023 / Vol. 13 / Issue 08
Funding
United States Department of Agriculture (USDA);
Keywords
whole genome sequencing; pipeline; variants;
DOI
10.1093/g3journal/jkad117
Chinese Library Classification (CLC) number
Q3 [Genetics];
Discipline classification code
071007 ; 090102 ;
Abstract
Advancements in massively parallel short-read sequencing technologies and the associated decreasing costs have led to large and diverse variant discovery efforts across species. However, processing high-throughput short-read sequencing data can be challenging, with potential pitfalls and bioinformatics bottlenecks in generating reproducible results. Although a number of pipelines exist that address these challenges, they are often geared toward humans or traditional model organisms and can be difficult to configure across institutions. Whole Animal Genome Sequencing (WAGS) is an open-source set of user-friendly, containerized pipelines designed to simplify the identification of germline short variants (SNPs and indels) and structural variants (SVs). It is geared toward the veterinary community but adaptable to any species with a suitable reference genome. We present a description of the pipelines, which are adapted from the best practices of the Genome Analysis Toolkit (GATK), along with benchmarking data from both the preprocessing and joint genotyping steps, consistent with a typical user workflow.
Pages: 6