An improved de novo genome assembly of the common marmoset genome yields improved contiguity and increased mapping rates of sequence data

Cited by: 9
Authors
Jayakumar, Vasanthan [1]
Ishii, Hiromi [1]
Seki, Misato [1]
Kumita, Wakako [2]
Inoue, Takashi [2]
Hase, Sumitaka [1]
Sato, Kengo [1]
Okano, Hideyuki [3, 4]
Sasaki, Erika [2]
Sakakibara, Yasubumi [1]
Affiliations
[1] Keio Univ, Dept Biosci & Informat, Yokohama, Kanagawa 2238522, Japan
[2] Cent Inst Expt Animals, Dept Marmoset Biol & Med, Kawasaki, Kanagawa 2100821, Japan
[3] Keio Univ, Dept Physiol, Sch Med, Shinjuku Ku, Tokyo 1608582, Japan
[4] RIKEN Ctr Brain Sci, Lab Marmoset Neural Architecture, Wako, Saitama 3510198, Japan
Keywords
Common marmoset; Callithrix jacchus; De novo assembly; Non-human primate genomics; Chromosome-scale scaffolds; NEUROSCIENCE RESEARCH; PROVIDES INSIGHT; ALIGNMENT; ANNOTATION; MODEL; RNA
DOI
10.1186/s12864-020-6657-2
Chinese Library Classification
Q81 [Bioengineering (Biotechnology)]; Q93 [Microbiology]
Discipline classification codes
071005; 0836; 090102; 100705
Abstract
Background: The common marmoset (Callithrix jacchus) is one of the most studied primate model organisms. However, the marmoset genomes available in public databases are highly fragmented and contain many sequence gaps, which hinders research in marmoset genomics and transcriptomics.
Results: Here we use single-molecule, long-read sequence data to improve and update the existing genome assembly and report a near-complete genome of the common marmoset. The assembly is 2.79 Gb in size, with a contig N50 length of 6.37 Mb and a chromosomal scaffold N50 length of 143.91 Mb, making it the most contiguous, highest-quality marmoset genome to date. Approximately 90% of the assembled genome is represented in contigs longer than 1 Mb, an approximately 104-fold improvement in contiguity over the previously published marmoset genome. More than 98% of the gaps in the previously published genomes were filled successfully, which improved the mapping rates of genomic and transcriptomic data onto the assembled genome.
Conclusions: Altogether, the updated, high-quality common marmoset genome assembly provides improvements at various levels over previous versions of the marmoset genome assembly. This will allow researchers working on primate genomics to use the genome more efficiently with their genomic and transcriptomic sequence data.
Pages: 9