Hybrid de novo genome assembly of the Chinese herbal fleabane Erigeron breviscapus

Cited by: 23
Authors
Yang, Jing [1]
Zhang, Guanghui [2]
Zhang, Jing [3]
Liu, Hui [4,5]
Chen, Wei [1,6]
Wang, Xiao [4,5]
Li, Yahe [7]
Dong, Yang [1,6,8]
Yang, Shengchao [2]
Affiliations
[1] Yunnan Agr Univ, Biol Big Data Coll, Kunming 650201, Peoples R China
[2] Yunnan Agr Univ, Natl Local Joint Engn Res Ctr Germplasm Utilizat, Kunming 650201, Peoples R China
[3] NOWBIO Technol Co Ltd, Kunming 650202, Peoples R China
[4] Chinese Acad Sci, Kunming Inst Zool, State Key Lab Genet Resources & Evolut, Kunming 650223, Peoples R China
[5] Univ Chinese Acad Sci, Beijing 100049, Peoples R China
[6] Yunnan Res Inst Local Plateau Agr & Ind, Kunming 650201, Peoples R China
[7] Longjin Pharmaceut Co Ltd, Kunming 650228, Peoples R China
[8] Kunming Univ Sci & Technol, Coll Life Sci, Kunming 650500, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Erigeron breviscapus; Illumina sequencing; PacBio sequencing; GENE; PREDICTION; ASTERACEAE; FRAMEWORK; PROGRAM; FINDER; TOOL;
DOI
10.1093/gigascience/gix028
Chinese Library Classification number
Q [Biological Sciences];
Discipline classification code
07; 0710; 09;
Abstract
Background: Plants in the genus Erigeron of the Compositae (Asteraceae) family are commonly called fleabanes, possibly because certain chemicals in these plants were believed to repel fleas. In traditional Chinese medicine, Erigeron breviscapus, which is native to China, was widely used in the treatment of cerebrovascular disease. A handful of bioactive compounds, including scutellarin, 3,5-dicaffeoylquinic acid, and 3,4-dicaffeoylquinic acid, have been isolated from the plant. To find novel medicinal compounds and understand their biosynthetic pathways, we propose to sequence the genome of E. breviscapus. Findings: We assembled the highly heterozygous E. breviscapus genome using a combination of PacBio single-molecule real-time sequencing and next-generation sequencing on the Illumina HiSeq platform. The final draft genome is approximately 1.2 Gb, with contig and scaffold N50 sizes of 18.8 kb and 31.5 kb, respectively. Further analyses predicted 37 504 protein-coding genes in the E. breviscapus genome and 8172 gene families shared among Compositae species. Conclusions: The E. breviscapus genome provides a valuable resource for the investigation of novel bioactive compounds in this Chinese herb.
Pages: 1 / 17
Page count: 7