An improved genome release (version Mt4.0) for the model legume Medicago truncatula

Cited by: 296
Authors
Tang, Haibao [1]
Krishnakumar, Vivek [1]
Bidwell, Shelby [1]
Rosen, Benjamin [1]
Chan, Agnes [1]
Zhou, Shiguo [2]
Gentzbittel, Laurent [3]
Childs, Kevin L. [4]
Yandell, Mark [5]
Gundlach, Heidrun [6]
Mayer, Klaus F. X. [6]
Schwartz, David C. [2]
Town, Christopher D. [1]
Affiliations
[1] J Craig Venter Inst, Rockville, MD 20850 USA
[2] Univ Wisconsin, Dept Chem, Lab Mol & Computat Genom, Madison, WI 53706 USA
[3] Univ Toulouse, INP ENSAT, CNRS, Lab Ecol Fonct & Environm, Toulouse, France
[4] Michigan State Univ, Dept Plant Biol, E Lansing, MI 48824 USA
[5] Univ Utah, Dept Human Genet, Salt Lake City, UT USA
[6] German Res Ctr Environm Hlth GmbH, Helmholtz Ctr Munich, MIPS IBIS Inst Bioinformat & Syst Biol, Neuherberg, Germany
Funding
US National Science Foundation
Keywords
Medicago; Legume; Genome assembly; Gene annotation; Optical map; RNA-SEQ DATA; RICE GENOME; ANNOTATION; SEQUENCE; ALIGNMENT; GENES; ASSEMBLIES; DISCOVERY; EVOLUTION; PIPELINE
DOI
10.1186/1471-2164-15-312
Chinese Library Classification (CLC) number
Q81 [Bioengineering (biotechnology)]; Q93 [Microbiology]
Discipline classification codes
071005; 0836; 090102; 100705
Abstract
Background: Medicago truncatula, a close relative of alfalfa, is a preeminent model for studying nitrogen fixation, symbiosis, and legume genomics. The Medicago sequencing project began in 2003 with the goal of deciphering sequences originating from the euchromatic portion of the genome. The initial sequencing approach was based on a BAC tiling path, culminating in a BAC-based assembly (Mt3.5) as well as an in-depth analysis of the genome published in 2011.

Results: Here we describe a further improved and refined version of the M. truncatula genome (Mt4.0) based on de novo whole genome shotgun assembly of a majority of Illumina and 454 reads using ALLPATHS-LG. The ALLPATHS-LG scaffolds were anchored onto the pseudomolecules on the basis of alignments to both the optical map and the genotyping-by-sequencing (GBS) map. The Mt4.0 pseudomolecules encompass ~360 Mb of actual sequence spanning 390 Mb, of which ~330 Mb align perfectly with the optical map. This is a drastic improvement over the BAC-based Mt3.5, which contained only 70% (~250 Mb) of the sequence in the current version. Most of the sequences and genes that previously resided on the unanchored portion of Mt3.5 have now been incorporated into the Mt4.0 pseudomolecules, with the exception of ~28 Mb of unplaced sequences. With regard to gene annotation, the genome has been re-annotated through our gene prediction pipeline, which integrates EST, RNA-seq, protein and gene prediction evidence. A total of 50,894 genes (31,661 high confidence and 19,233 low confidence) are included in Mt4.0, overlapping ~82% of the gene loci annotated in Mt3.5. Of the remaining Mt3.5 genes, 14% have been deprecated to an "unsupported" status and 4% are absent from the Mt4.0 predictions.

Conclusions: Mt4.0 and its associated resources, such as genome browsers, BLAST-able datasets and gene information pages, can be found on the JCVI Medicago web site (http://www.jcvi.org/medicago). The assembly and annotation have been deposited in GenBank (BioProject: PRJNA10791). The heavily curated chromosomal sequences and associated gene models of Medicago will serve as a better reference for legume biology and comparative genomics.
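The figures quoted in the abstract can be cross-checked with a few lines of arithmetic. The following is only an illustrative sketch, not part of the published analysis: the constant and function names are hypothetical, the Mb values are the approximate ones given in the abstract, and reading "span minus sequence" as gap length is an assumption.

```python
# Figures as quoted in the abstract; Mb values are approximate ("~" in the text).
HIGH_CONFIDENCE_GENES = 31_661
LOW_CONFIDENCE_GENES = 19_233
TOTAL_GENES = 50_894

MT40_SEQUENCE_MB = 360   # actual sequence in Mt4.0 pseudomolecules
MT40_SPAN_MB = 390       # total span of Mt4.0 pseudomolecules
MT35_SEQUENCE_MB = 250   # sequence in the BAC-based Mt3.5

# Fate of Mt3.5 gene loci in Mt4.0, in percent.
MT35_LOCI_FATE_PERCENT = {"overlapped": 82, "unsupported": 14, "absent": 4}


def summarize():
    """Return simple consistency checks on the quoted numbers."""
    return {
        # Do the two confidence classes add up to the reported total?
        "gene_total_matches": HIGH_CONFIDENCE_GENES + LOW_CONFIDENCE_GENES == TOTAL_GENES,
        # Mt3.5 sequence as a fraction of Mt4.0 sequence (abstract says 70%).
        "mt35_fraction_of_mt40": round(MT35_SEQUENCE_MB / MT40_SEQUENCE_MB, 2),
        # Span minus sequence; assumed here to correspond to gaps.
        "span_minus_sequence_mb": MT40_SPAN_MB - MT40_SEQUENCE_MB,
        # The three Mt3.5 locus categories should cover all loci.
        "loci_fate_total_percent": sum(MT35_LOCI_FATE_PERCENT.values()),
    }


if __name__ == "__main__":
    for name, value in summarize().items():
        print(f"{name}: {value}")
```

With the rounded inputs, the Mt3.5 fraction comes out at 0.69, which is consistent with the 70% stated in the abstract.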
Pages: 14
Related papers
42 in total
[1]   Natural diversity in the model legume Medicago truncatula allows identifying distinct genetic mechanisms conferring partial resistance to Verticillium wilt [J].
Ben, Cecile ;
Toueni, Maoulida ;
Montanari, Sara ;
Tardin, Marie-Claire ;
Fervel, Magalie ;
Negahi, Azam ;
Saint-Pierre, Laure ;
Mathieu, Guillaume ;
Gras, Marie-Christine ;
Noel, Dominique ;
Prosperi, Jean-Marie ;
Pilet-Nayel, Marie-Laure ;
Baranger, Alain ;
Huguet, Thierry ;
Julier, Bernadette ;
Rickauer, Martina ;
Gentzbittel, Laurent .
JOURNAL OF EXPERIMENTAL BOTANY, 2013, 64 (01) :317-332
[2]   Nuclear DNA amounts in angiosperms: targets, trends and tomorrow [J].
Bennett, M. D. ;
Leitch, I. J. .
ANNALS OF BOTANY, 2011, 107 (03) :467-590
[3]   Legume genome evolution viewed through the Medicago truncatula and Lotus japonicus genomes [J].
Cannon, Steven B. ;
Sterck, Lieven ;
Rombauts, Stephane ;
Sato, Shusei ;
Cheung, Foo ;
Gouzy, Jerome ;
Wang, Xiaohong ;
Mudge, Joann ;
Vasdewani, Jayprakash ;
Schiex, Thomas ;
Spannagl, Manuel ;
Monaghan, Erin ;
Nicholson, Christine ;
Humphray, Sean J. ;
Schoof, Heiko ;
Mayer, Klaus F. X. ;
Rogers, Jane ;
Quetier, Francis ;
Oldroyd, Giles E. ;
Debelle, Frederic ;
Cook, Douglas R. ;
Retzel, Ernest F. ;
Roe, Bruce A. ;
Town, Christopher D. ;
Tabata, Satoshi ;
Van de Peer, Yves ;
Young, Nevin D. .
PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 2006, 103 (40) :14959-14964
[4]   MAKER: An easy-to-use annotation pipeline designed for emerging model organism genomes [J].
Cantarel, Brandi L. ;
Korf, Ian ;
Robb, Sofia M. C. ;
Parra, Genis ;
Ross, Eric ;
Moore, Barry ;
Holt, Carson ;
Alvarado, Alejandro Sanchez ;
Yandell, Mark .
GENOME RESEARCH, 2008, 18 (01) :188-196
[5]   A Robust, Simple Genotyping-by-Sequencing (GBS) Approach for High Diversity Species [J].
Elshire, Robert J. ;
Glaubitz, Jeffrey C. ;
Sun, Qi ;
Poland, Jesse A. ;
Kawamoto, Ken ;
Buckler, Edward S. ;
Mitchell, Sharon E. .
PLOS ONE, 2011, 6 (05)
[6]   Genome annotation in plants and fungi: EuGene as a model platform [J].
Foissac, Sylvain ;
Gouzy, Jerome ;
Rombauts, Stephane ;
Mathe, Catherine ;
Amselem, Joelle ;
Sterck, Lieven ;
Van de Peer, Yves ;
Rouze, Pierre ;
Schiex, Thomas .
CURRENT BIOINFORMATICS, 2008, 3 (02) :87-97
[7]   High-quality draft assemblies of mammalian genomes from massively parallel sequence data [J].
Gnerre, Sante ;
MacCallum, Iain ;
Przybylski, Dariusz ;
Ribeiro, Filipe J. ;
Burton, Joshua N. ;
Walker, Bruce J. ;
Sharpe, Ted ;
Hall, Giles ;
Shea, Terrance P. ;
Sykes, Sean ;
Berlin, Aaron M. ;
Aird, Daniel ;
Costello, Maura ;
Daza, Riza ;
Williams, Louise ;
Nicol, Robert ;
Gnirke, Andreas ;
Nusbaum, Chad ;
Lander, Eric S. ;
Jaffe, David B. .
PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 2011, 108 (04) :1513-1518
[8]   Full-length transcriptome assembly from RNA-Seq data without a reference genome [J].
Grabherr, Manfred G. ;
Haas, Brian J. ;
Yassour, Moran ;
Levin, Joshua Z. ;
Thompson, Dawn A. ;
Amit, Ido ;
Adiconis, Xian ;
Fan, Lin ;
Raychowdhury, Raktima ;
Zeng, Qiandong ;
Chen, Zehua ;
Mauceli, Evan ;
Hacohen, Nir ;
Gnirke, Andreas ;
Rhind, Nicholas ;
di Palma, Federica ;
Birren, Bruce W. ;
Nusbaum, Chad ;
Lindblad-Toh, Kerstin ;
Friedman, Nir ;
Regev, Aviv .
NATURE BIOTECHNOLOGY, 2011, 29 (07) :644-U130
[9]   Improving the Arabidopsis genome annotation using maximal transcript alignment assemblies [J].
Haas, BJ ;
Delcher, AL ;
Mount, SM ;
Wortman, JR ;
Smith, RK ;
Hannick, LI ;
Maiti, R ;
Ronning, CM ;
Rusch, DB ;
Town, CD ;
Salzberg, SL ;
White, O .
NUCLEIC ACIDS RESEARCH, 2003, 31 (19) :5654-5666
[10]   Automated eukaryotic gene structure annotation using EVidenceModeler and the program to assemble spliced alignments [J].
Haas, Brian J. ;
Salzberg, Steven L. ;
Zhu, Wei ;
Pertea, Mihaela ;
Allen, Jonathan E. ;
Orvis, Joshua ;
White, Owen ;
Buell, C. Robin ;
Wortman, Jennifer R. .
GENOME BIOLOGY, 2008, 9 (01)