RaGOO: fast and accurate reference-guided scaffolding of draft genomes

Citations: 412
Authors
Alonge, Michael [1]
Soyk, Sebastian [2]
Ramakrishnan, Srividya [1]
Wang, Xingang [2]
Goodwin, Sara [2]
Sedlazeck, Fritz J. [3]
Lippman, Zachary B. [2, 4]
Schatz, Michael C. [1, 2, 5]
Affiliations
[1] Johns Hopkins Univ, Dept Comp Sci, Baltimore, MD 21218 USA
[2] Cold Spring Harbor Lab, POB 100, Cold Spring Harbor, NY 11724 USA
[3] Baylor Coll Med, Human Genome Sequencing Ctr, Houston, TX 77030 USA
[4] Cold Spring Harbor Lab, Howard Hughes Med Inst, Cold Spring Harbor, NY 11724 USA
[5] Johns Hopkins Univ, Dept Biol, Baltimore, MD 21218 USA
Funding
National Science Foundation (USA); National Institutes of Health (USA)
Keywords
Pseudomolecule; Reference-guided; Genome assembly; Scaffolding; Genome alignment; Long-read sequencing; Tomato; READ ALIGNMENT; ANNOTATION; RNA; PROVIDES;
DOI
10.1186/s13059-019-1829-6
Chinese Library Classification (CLC) number
Q81 [Bioengineering (Biotechnology)]; Q93 [Microbiology]
Subject classification codes
071005; 0836; 090102; 100705
Abstract
We present RaGOO, a reference-guided contig ordering and orienting tool that leverages the speed and sensitivity of Minimap2 to accurately achieve chromosome-scale assemblies in minutes. After the pseudomolecules are constructed, RaGOO identifies structural variants, including those spanning sequencing gaps. We show that RaGOO accurately orders and orients 3 de novo tomato genome assemblies, including the widely used M82 reference cultivar. We then demonstrate the scalability and utility of RaGOO with a pan-genome analysis of 103 Arabidopsis thaliana accessions by examining the structural variants detected in the newly assembled pseudomolecules. RaGOO is available open source at https://github.com/malonge/RaGOO.
Pages: 17