The first high-quality chromosome-level genome of the Sipuncula Sipunculus nudus using HiFi and Hi-C data

Citations: 6
Authors
Zheng, Zhe [1,2]
Lai, Zhuoxin [1]
Wu, Bin [3]
Song, Xinlin [1]
Zhao, Wei [3]
Zhong, Ruzhuo [1]
Zhang, Jiawei [1]
Liao, Yongshan [1,2]
Yang, Chuangye [1,2]
Deng, Yuewen [1,2]
Mei, Junpu [3,4]
Yue, Zhen [4]
Jian, Jianbo [3]
Wang, Qingheng [1,2]
Affiliations
[1] Guangdong Ocean Univ, Fisheries Coll, Zhanjiang 524088, Guangdong, Peoples R China
[2] Guangdong Prov Key Lab Aquat Anim Dis Control & Hl, Zhanjiang 524088, Guangdong, Peoples R China
[3] BGI Shenzhen, Shenzhen 518083, Guangdong, Peoples R China
[4] BGI Shenzhen, BGI Sanya, Sanya 572025, Hainan, Peoples R China
Keywords
WEB-BASED TOOL; SEQUENCE ALIGNMENT; SUPPLEMENT TREMBL; GENE ONTOLOGY; ANNOTATION; IDENTIFICATION; PREDICTION; FAMILIES; RESOURCE; SYSTEM;
DOI
10.1038/s41597-023-02235-7
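Chinese Library Classification (CLC) number
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences]
Subject classification code
07; 0710; 09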
Abstract
Sipuncula is a class of exocoelomic unsegmented animals whose evolutionary relationships are unresolved. The peanut worm Sipunculus nudus is a globally distributed, economically important species belonging to the class Sipuncula. Herein, we present the first high-quality chromosome-level assembly of S. nudus based on HiFi reads and high-resolution chromosome conformation capture (Hi-C) data. The assembled genome was 1,427 Mb, with a contig N50 length of 29.46 Mb and a scaffold N50 length of 80.87 Mb. Approximately 97.91% of the genome sequence was anchored to 17 chromosomes. A BUSCO assessment showed that 97.7% of the expected conserved genes were present in the genome assembly. The genome was composed of 47.91% repetitive sequences, and 28,749 protein-coding genes were predicted. A phylogenetic tree demonstrated that Sipuncula belongs to Annelida and diverged from the common ancestor of Polychaeta. The high-quality chromosome-level genome of S. nudus will serve as a valuable reference for studies of the genetic diversity and evolution of Lophotrochozoa.
Pages: 13