Haplotype-resolved chromosome-level genome assembly of Ehretia macrophylla

Times cited: 1
Authors
Cheng, Shiping [1]
Zhang, Qikun [2]
Geng, Xining [1]
Xie, Lihua [1]
Chen, Minghui [1]
Jiao, Siqian [1]
Qi, Shuaizheng [1]
Yao, Pengqiang [1]
Lu, Mailin [3]
Zhang, Mengren [3]
Zhai, Wenshan [4]
Yun, Quanzheng [5]
Feng, Shangguo [6, 7]
Affiliations
[1] Pingdingshan Univ, Henan Prov Key Lab Germplasm Innovat & Utilizat Ec, Pingdingshan 467000, Peoples R China
[2] Kaitai Bio Co, Hangzhou 310000, Peoples R China
[3] Henan Forestry Vocat Coll, Luoyang 471000, Peoples R China
[4] Henan Senzhuang Cukang Agr & Forestry Technol Co L, Luoyang 471000, Peoples R China
[5] Kaitai Mingjing Genetech Corp, Beijing 100070, Peoples R China
[6] Hangzhou Normal Univ, Coll Life & Environm Sci, Hangzhou 310036, Peoples R China
[7] Hangzhou Normal Univ, Zhejiang Prov Key Lab Genet Improvement & Qual Con, Hangzhou 310036, Peoples R China
Keywords
PROVIDES; ANNOTATION; ALIGNMENT; SEQUENCE; SYSTEM;
DOI
10.1038/s41597-024-03431-9
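Chinese Library Classification (CLC) number
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences];
Discipline classification code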
07 ; 0710 ; 09 ;
Abstract
Ehretia macrophylla Wall., known as wild loquat, is an ecologically, economically, and medicinally significant tree species widely grown in China, Japan, Vietnam, and Nepal. In this study, we generated a haplotype-resolved, chromosome-scale genome assembly of E. macrophylla by integrating PacBio HiFi long reads, Illumina short reads, and Hi-C data. The assembly consists of two haplotypes of 1.82 Gb and 1.58 Gb, with contig N50 lengths of 28.11 Mb and 21.57 Mb, respectively. In total, 99.41% of the assembly was anchored to 40 pseudo-chromosomes. We predicted 58,886 protein-coding genes, 99.60% of which were functionally annotated against public databases. We also detected 2.65 Gb of repeat sequences, 659,290 rRNAs, 4,931 tRNAs, and 4,688 other ncRNAs. This high-quality genome assembly provides a solid foundation for molecular breeding and functional genomics of E. macrophylla.
Pages: 13