High-fidelity (repeat) consensus sequences from short reads using combined read clustering and assembly

Cited by: 3
Authors
Mann, Ludwig [1]
Balasch, Kristin [1]
Schmidt, Nicola [1]
Heitkam, Tony [1,2]
Affiliations
[1] Tech Univ Dresden, Fac Biol, D-01069 Dresden, Germany
[2] Karl Franzens Univ Graz, Inst Biol, NAWI Graz, A-8010 Graz, Austria
Keywords
Repetitive DNA; Transposable elements; Consensus sequences; Repeat assembly; Repeat clustering; eccDNA; Ribosomal DNA; rDNA; Non-model organisms; MALE-FERTILE; GENOME; DNA; TRANSCRIPTION; PLANTS;
DOI
10.1186/s12864-023-09948-4
Chinese Library Classification (CLC) number
Q81 [Bioengineering (Biotechnology)]; Q93 [Microbiology];
Discipline classification code
071005 ; 0836 ; 090102 ; 100705 ;
Abstract
Background: Despite the many cheap and fast ways to generate genomic data, accurate genome assembly remains a problem; repeats in particular are vastly underrepresented and often misassembled. Because low-coverage short reads are already sufficient to represent the repeat landscape of any given genome, many read clustering algorithms have been proposed that provide repeat identification and classification. But how can trustworthy, reliable and representative repeat consensuses be derived from unassembled genomes?
Results: Here, we combine methods from repeat identification and genome assembly to derive these robust consensuses. We test several use cases, such as (1) consensus building from clustered short reads of non-model genomes, (2) consensus building from genome-wide amplification setups, and (3) specific repeat-centred questions, such as the linked vs. unlinked arrangement of ribosomal genes. In all our use cases, the derived consensuses are robust and representative. To evaluate overall performance, we compare our high-fidelity repeat consensuses to RepeatExplorer2-derived contigs and check whether they represent real transposable elements as found in long reads. Our results demonstrate that useful, reliable and trustworthy consensuses can be generated from short reads by combining read clustering and genome assembly methods in an automatable way.
Conclusion: We anticipate that our workflow opens the way towards more efficient and less manual repeat characterization and annotation, benefitting all genome studies, but especially those of non-model organisms.
Pages: 11