High-quality chromosome-scale assembly of the walnut (Juglans regia L.) reference genome

Cited by: 96
Authors
Marrano, Annarita [1]
Britton, Monica [2]
Zaini, Paulo A. [1]
Zimin, Aleksey V. [3, 4]
Workman, Rachael E. [3]
Puiu, Daniela [4]
Bianco, Luca [5]
Di Pierro, Erica Adele [5]
Allen, Brian J. [1]
Chakraborty, Sandeep [1]
Troggio, Michela [5]
Leslie, Charles A. [1]
Timp, Winston [3, 4]
Dandekar, Abhaya [1]
Salzberg, Steven L. [3, 4, 6, 7]
Neale, David B. [1]
Affiliations
[1] Univ Calif Davis, Dept Plant Sci, One Shields Ave, Davis, CA 95616 USA
[2] Univ Calif Davis, Genome Ctr, Bioinformat Core Facil, One Shields Ave, Davis, CA 95616 USA
[3] Johns Hopkins Univ, Dept Biomed Engn, 720 Rutland Ave, Baltimore, MD 21205 USA
[4] Johns Hopkins Univ, Whiting Sch Engn, Ctr Computat Biol, 3100 Wyman Pk Dr, Baltimore, MD 21211 USA
[5] Fdn Edmund Mach, Res & Innovat Ctr, Via E Mach 1, I-38010 San Michele All Adige, TN, Italy
[6] Johns Hopkins Univ, Dept Comp Sci, 3400 North Charles St, Baltimore, MD 21218 USA
[7] Johns Hopkins Univ, Dept Biostat, 3400 North Charles St, Baltimore, MD 21218 USA
Keywords
Nanopore; Hi-C; Iso-Seq; gene prediction; genetic diversity; proteome; allergens; ALIGNMENT; RNA; ANNOTATION; READS; DNA; IDENTIFICATION; COMPLETENESS; INTERPROSCAN; GENERATION; DIVERSITY
DOI
10.1093/gigascience/giaa050
Chinese Library Classification (CLC) number
Q [Biological Sciences]
Discipline classification code
07; 0710; 09
Abstract
Background: The release of the first reference genome of walnut (Juglans regia L.) enabled many achievements in the characterization of walnut genetic and functional variation. However, that assembly is highly fragmented, preventing the integration of genetic, transcriptomic, and proteomic information to fully elucidate walnut biological processes.

Findings: Here, we report the new chromosome-scale assembly of the walnut reference genome (Chandler v2.0), obtained by combining Oxford Nanopore long-read sequencing with chromosome conformation capture (Hi-C) technology. Relative to the previous reference genome, the new assembly features an 84.4-fold increase in N50 size, with the 16 chromosomal pseudomolecules assembled and representing 95% of its total length. Using full-length transcripts from single-molecule real-time sequencing, we predicted 37,554 gene models, with a mean gene length greater than that of the previous gene annotations. Most of the new protein-coding genes (90%) present both start and stop codons, a significant improvement over Chandler v1.0 (only 48%). We then tested the potential impact of the new chromosome-level genome on different areas of walnut research. By studying the proteome changes occurring during male flower development, we observed that the virtual proteome obtained from Chandler v2.0 presents fewer artifacts than that of the previous reference genome, enabling the identification of a new potential pollen allergen in walnut. The new chromosome-scale genome also facilitates in-depth studies of intraspecies genetic diversity by revealing previously undetected autozygous regions in Chandler, likely resulting from inbreeding, and 195 genomic regions highly differentiated between Western and Eastern walnut cultivars.

Conclusion: Overall, Chandler v2.0 will serve as a valuable resource to better understand and explore walnut biology.
Pages: 16
Related papers
96 in total
[1]  
Alexa A., GENE SET ENRICHMENT
[2]  
[Anonymous], 1975, WAGENINGEN NATURA MO
[3]   Genome-wide patterns of population structure and association mapping of nut-related traits in Persian walnut populations from Iran using the Axiom J. regia 700K SNP array [J].
Arab, Mohammad Mehdi ;
Marrano, Annarita ;
Abdollahi-Arpanahi, Rostam ;
Leslie, Charles A. ;
Askari, Hossein ;
Neale, David B. ;
Vahdati, Kourosh .
SCIENTIFIC REPORTS, 2019, 9 (1)
[4]   Genetic Diversity, Structure and Differentiation in Cultivated Walnut (Juglans regia L.) [J].
Aradhya, M. ;
Woeste, K. ;
Velasco, D. .
VI INTERNATIONAL WALNUT SYMPOSIUM, 2010, 861 :127-132
[5]   The SWISS-MODEL workspace: a web-based environment for protein structure homology modelling [J].
Arnold, K ;
Bordoli, L ;
Kopp, J ;
Schwede, T .
BIOINFORMATICS, 2006, 22 (02) :195-201
[6]   Chromosome-scale assemblies of plant genomes using nanopore long reads and optical maps [J].
Belser, Caroline ;
Istace, Benjamin ;
Denis, Erwan ;
Dubarry, Marion ;
Baurens, Franc-Christophe ;
Falentin, Cyril ;
Genete, Mathieu ;
Berrabah, Wahiba ;
Chevre, Anne-Marie ;
Delourme, Regine ;
Deniot, Gwenaelle ;
Denoeud, France ;
Duffe, Philippe ;
Engelen, Stefan ;
Lemainque, Arnaud ;
Manzanares-Dauleux, Maria ;
Martin, Guillaume ;
Morice, Jerome ;
Noel, Benjamin ;
Vekemans, Xavier ;
D'Hont, Angelique ;
Rousseau-Gueutin, Mathieu ;
Barbe, Valerie ;
Cruaud, Corinne ;
Wincker, Patrick ;
Aury, Jean-Marc .
NATURE PLANTS, 2018, 4 (11) :879-+
[7]   Hi-C: A comprehensive technique to capture the conformation of genomes [J].
Belton, Jon-Matthew ;
McCord, Rachel Patton ;
Gibcus, Johan Harmen ;
Naumova, Natalia ;
Zhan, Ye ;
Dekker, Job .
METHODS, 2012, 58 (03) :268-276
[8]   Tandem repeats finder: a program to analyze DNA sequences [J].
Benson, G .
NUCLEIC ACIDS RESEARCH, 1999, 27 (02) :573-580
[9]   Association and linkage mapping to unravel genetic architecture of phenological traits and lateral bearing in Persian walnut (Juglans regia L.) [J].
Bernard, Anthony ;
Marrano, Annarita ;
Donkpegan, Armel ;
Brown, Patrick J. ;
Leslie, Charles A. ;
Neale, David B. ;
Lheureux, Fabrice ;
Dirlewanger, Elisabeth .
BMC GENOMICS, 2020, 21 (01)
[10]   Analysis of genetic diversity and structure in a worldwide walnut (Juglans regia L.) germplasm using SSR markers [J].
Bernard, Anthony ;
Barreneche, Teresa ;
Lheureux, Fabrice ;
Dirlewanger, Elisabeth .
PLOS ONE, 2018, 13 (11)