De novo genome assembly and annotation of Australia's largest freshwater fish, the Murray cod (Maccullochella peelii) from Illumina and Nanopore sequencing reads

Cited by: 53
Authors
Austin, Christopher M. [1 ,2 ,3 ]
Tan, Mun Hua [1 ,2 ,3 ]
Harrisson, Katherine A. [4 ]
Lee, Yin Peng [2 ,3 ]
Croft, Laurence J. [3 ,5 ]
Sunnucks, Paul [4 ]
Pavlova, Alexandra [4 ]
Gan, Han Ming [1 ,2 ,3 ]
Affiliations
[1] Deakin Univ, Sch Life & Environm Sci, Ctr Integrat Ecol, Waurn Ponds, Vic 3216, Australia
[2] Monash Univ Malaysia, Genom Facil, Trop Med & Biol Platform, Jalan Lagoon Selatan, Petaling Jaya 47500, Selangor, Malaysia
[3] Monash Univ Malaysia, Sch Sci, Jalan Lagoon Selatan, Petaling Jaya 47500, Selangor, Malaysia
[4] Monash Univ, Sch Biol Sci, Clayton Campus, Clayton, Vic, Australia
[5] Malaysian Genom Resource Ctr Berhad, Blvd Signature Off, Kuala Lumpur, Malaysia
Funding
Australian Research Council
Keywords
Murray Cod; long reads; genome; transcriptome; hybrid assembly; EVOLUTION; EFFICIENT; ALIGNMENT; DATABASE; RANGE;
DOI
10.1093/gigascience/gix063
Chinese Library Classification number
Q [Biological sciences]
Discipline classification codes
07; 0710; 09
Abstract
Background: One of the most iconic Australian fish is the Murray cod, Maccullochella peelii (Mitchell, 1838), a freshwater species that can grow to ~1.8 metres in length and live to at least 48 years of age. The Murray cod is of conservation concern as a result of strong population contractions, but is also popular for recreational fishing and is of growing aquaculture interest. In this study, we report the whole genome sequence of the Murray cod to support ongoing population genetics, conservation and management-related research, as well as to better understand the evolutionary ecology and history of the species. Findings: A draft Murray cod genome of 633 Mbp (N50 = 109,974 bp; BUSCO and CEGMA completeness of 94.2% and 91.9%, respectively) with an estimated 148 Mbp of putative repetitive sequences was assembled from the combined sequencing data of two individual fish with an identical maternal lineage. The first individual yielded 47.2 Gb of Illumina HiSeq data and 804 Mb of Nanopore data, while the second yielded 23.2 Gb of Illumina MiSeq data. Including Nanopore reads for scaffolding, followed by gap-closing using Illumina data, led to a 29% reduction in the number of scaffolds and to increases of 55% and 54% in the scaffold and contig N50, respectively. We also report the first transcriptome of the Murray cod, which was subsequently used to annotate the genome, leading to the identification of 26,539 protein-coding genes. Conclusions: We present the whole genome of the Murray cod and anticipate that it will be a catalyst for a range of genetic, genomic and phylogenetic studies of the Murray cod and, more generally, of other fish species of the family Percichthyidae.
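The abstract reports contiguity as N50 and its relative change after scaffolding. As an illustrative sketch only (not the authors' pipeline), N50 can be taken as the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total assembly length. The function names `n50` and `percent_change` and the toy lengths below are hypothetical and are not taken from the paper.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths (in bp)."""
    if not lengths:
        raise ValueError("need at least one sequence length")
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence down until half the total is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


def percent_change(before, after):
    """Relative change from `before` to `after`, in percent."""
    return 100.0 * (after - before) / before


# Toy scaffold lengths (invented for illustration, not from the paper).
before_scaffolding = [50, 40, 30, 20, 10, 10]
# Assume long reads join the 30 bp and 20 bp scaffolds into one.
after_scaffolding = [50, 50, 40, 10, 10]

n50_before = n50(before_scaffolding)
n50_after = n50(after_scaffolding)
n50_gain = percent_change(n50_before, n50_after)
count_change = percent_change(len(before_scaffolding), len(after_scaffolding))
```

In this toy case joining two scaffolds lowers the scaffold count while raising N50, which is the same direction of change the abstract describes for the Nanopore-scaffolded assembly.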
Pages: 19