A high-quality de novo genome assembly based on nanopore sequencing of a wild-caught coconut rhinoceros beetle (Oryctes rhinoceros)

Cited by: 10
Authors
Filipovic, Igor [1, 2]
Rasic, Gordana [2]
Hereward, James [1]
Gharuka, Maria [3]
Devine, Gregor J. [2]
Furlong, Michael J. [1]
Etebari, Kayvan [1]
Affiliations
[1] Univ Queensland, Sch Biol Sci, St Lucia, Qld, Australia
[2] QIMR Berghofer Med Res Inst, Mosquito Control Lab, Brisbane, Qld, Australia
[3] Minist Agr & Livestock, Div Res, Honiara, Solomon Islands
Keywords
Genome assembly; Genome annotation; Single insect nanopore sequencing; Oryctes rhinoceros; Coleoptera; ANNOTATION; GENERATION; ALIGNMENT; MOSQUITO; GENES; BUSCO
DOI
10.1186/s12864-022-08628-z
Chinese Library Classification (CLC) number
Q81 [Bioengineering (Biotechnology)]; Q93 [Microbiology]
Subject classification codes
071005; 0836; 090102; 100705
Abstract
Background: An optimal starting point for relating genome function to organismal biology is a high-quality nuclear genome assembly, and long-read sequencing is revolutionizing the production of this genomic resource in insects. Despite this, nuclear genome assemblies have been under-represented for agricultural insect pests, particularly from the order Coleoptera. Here we present a de novo genome assembly and structural annotation for the coconut rhinoceros beetle, Oryctes rhinoceros (Coleoptera: Scarabaeidae), based on Oxford Nanopore Technologies (ONT) long-read data generated from a wild-caught female, as well as the assembly process that also led to the recovery of the complete circular genome assemblies of the beetle's mitochondrial genome and that of the biocontrol agent, Oryctes rhinoceros nudivirus (OrNV). As an invasive pest of palm trees, O. rhinoceros is undergoing an expansion in its range across the Pacific Islands, requiring new approaches to management that may include strategies facilitated by genome assembly and annotation.
Results: High-quality DNA isolated from an adult female was used to create four ONT libraries that were sequenced using four MinION flow cells, producing a total of 27.2 Gb of high-quality long-read sequences. We employed an iterative assembly process and polishing with one lane of high-accuracy Illumina reads, obtaining a final size of the assembly of 377.36 Mb that had high contiguity (fragment N50 length = 12 Mb) and accuracy, as evidenced by the exceptionally high completeness of the benchmarked set of conserved single-copy orthologous genes (BUSCO completeness = 99.1%). These quality metrics place our assembly ahead of the published Coleopteran genomes, including that of an insect model, the red flour beetle (Tribolium castaneum). The structural annotation of the nuclear genome assembly contained a highly-accurate set of 16,371 protein-coding genes, with only 2.8% missing BUSCOs, and the expected number of non-coding RNAs. The number and structure of paralogous genes in a gene family like Sigma GST is lower than in another scarab beetle (Onthophagus taurus), but higher than in the red flour beetle (Tribolium castaneum), which suggests expansion of this GST class in Scarabaeidae. The quality of our gene models was also confirmed with the correct placement of O. rhinoceros among other members of the rhinoceros beetles (subfamily Dynastinae) in a phylogeny based on the sequences of 95 protein-coding genes in 373 beetle species from all major lineages of Coleoptera. Finally, we provide a list of 30 candidate dsRNA targets whose orthologs have been experimentally validated as highly effective targets for RNAi-based control of several beetles.
Conclusions: The genomic resources produced in this study form a foundation for further functional genetic research and management programs that may inform the control and surveillance of O. rhinoceros populations, and we demonstrate the efficacy of de novo genome assembly using long-read ONT data from a single field-caught insect.
Pages: 15