Single-molecule sequencing of the desiccation-tolerant grass Oropetium thomaeum

Cited by: 221
Authors
VanBuren, Robert [1]
Bryant, Doug [1]
Edger, Patrick P. [2, 3]
Tang, Haibao [4, 5]
Burgess, Diane [2]
Challabathula, Dinakar [6]
Spittle, Kristi [7]
Hall, Richard [7]
Gu, Jenny [7]
Lyons, Eric [4]
Freeling, Michael [2]
Bartels, Dorothea [6]
Ten Hallers, Boudewijn [8]
Hastie, Alex [8]
Michael, Todd P. [9]
Mockler, Todd C. [1]
Affiliations
[1] Donald Danforth Plant Sci Ctr, St Louis, MO 63132 USA
[2] Univ Calif Berkeley, Dept Plant & Microbial Biol, Berkeley, CA 94720 USA
[3] Michigan State Univ, Dept Hort, E Lansing, MI 48323 USA
[4] Univ Arizona, Sch Plant Sci, IPlant Collaborat, Tucson, AZ 85721 USA
[5] Fujian Agr & Forestry Univ, HIST, Ctr Genom & Biotechnol, Fuzhou 350002, Peoples R China
[6] Univ Bonn, IMBIO, D-53115 Bonn, Germany
[7] Pacific Biosci, Menlo Pk, CA 94025 USA
[8] BioNano Genom, San Diego, CA 92121 USA
[9] Ibis Biosci, Carlsbad, CA 92008 USA
Funding
US National Science Foundation;
Keywords
STRUCTURAL VARIATION; GENOME COMPARISONS; TANDEM REPEATS; DNA; REVEALS; GENE; SIZE; IDENTIFICATION; TRANSCRIPTOME; COMPLEXITY;
DOI
10.1038/nature15714
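Chinese Library Classification
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences];
Subject classification codes
07; 0710; 09;
Abstract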
Plant genomes, and eukaryotic genomes in general, are typically repetitive, polyploid and heterozygous, which complicates genome assembly(1). The short read lengths of early Sanger and current next-generation sequencing platforms hinder assembly through complex repeat regions, and many draft and reference genomes are fragmented, lacking skewed GC and repetitive intergenic sequences, which are gaining importance due to projects like the Encyclopedia of DNA Elements (ENCODE)(2). Here we report the whole-genome sequencing and assembly of the desiccation-tolerant grass Oropetium thomaeum. Using only single-molecule real-time sequencing, which generates long (>16 kilobases) reads with random errors, we assembled 99% (244 megabases) of the Oropetium genome into 625 contigs with an N50 length of 2.4 megabases. Oropetium is an example of a 'near-complete' draft genome which includes gapless coverage over gene space as well as intergenic sequences such as centromeres, telomeres, transposable elements and rRNA clusters that are typically unassembled in draft genomes. Oropetium has 28,466 protein-coding genes and 43% repeat sequences, yet with 30% more compact euchromatic regions it is the smallest known grass genome. The Oropetium genome demonstrates the utility of single-molecule real-time sequencing for assembling high-quality plant and other eukaryotic genomes, and serves as a valuable resource for the plant comparative genomics community.
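The abstract reports contiguity as an N50 length. As a minimal sketch of how that statistic is conventionally computed (contigs sorted from longest to shortest, then the length of the contig at which the running total first reaches half of the assembled bases), the example below uses made-up toy lengths, not the Oropetium contig data. The function name `n50` is a hypothetical helper chosen for illustration, and nothing here is taken from the authors' assembly pipeline.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths.

    N50 is the length L such that contigs of length >= L together
    contain at least half of all assembled bases.
    """
    if not lengths:
        raise ValueError("need at least one contig length")
    ordered = sorted(lengths, reverse=True)  # longest contig first
    half = sum(ordered) / 2                  # half of the assembled bases
    running = 0
    for length in ordered:
        running += length
        if running >= half:
            return length


# Toy, invented contig lengths (arbitrary units); total = 25, half = 12.5.
toy_contigs = [8, 6, 5, 3, 2, 1]
print(n50(toy_contigs))
```

A higher N50 for the same total assembly size means that fewer, longer contigs cover half of the assembly, which is why the statistic is commonly quoted alongside the contig count.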
Pages: 508 / U209
Page count: 16
Related papers
50 in total
  • [21] Widespread Polycistronic Transcripts in Fungi Revealed by Single-Molecule mRNA Sequencing
    Gordon, Sean P.
    Tseng, Elizabeth
    Salamov, Asaf
    Zhang, Jiwei
    Meng, Xiandong
    Zhao, Zhiying
    Kang, Dongwan
    Underwood, Jason
    Grigoriev, Igor V.
    Figueroa, Melania
    Schilling, Jonathan S.
    Chen, Feng
    Wang, Zhong
    PLOS ONE, 2015, 10 (07):
  • [22] Reviving the Transcriptome Studies: An Insight Into the Emergence of Single-Molecule Transcriptome Sequencing
    Wang, Bo
    Kumar, Vivek
    Olson, Andrew
    Ware, Doreen
    FRONTIERS IN GENETICS, 2019, 10
  • [23] Critical assessment of bioinformatics methods for the characterization of pathological repeat expansions with single-molecule sequencing data
    Chiara, Matteo
    Zambelli, Federico
    Picardi, Ernesto
    Horner, David S.
    Pesole, Graziano
    BRIEFINGS IN BIOINFORMATICS, 2020, 21 (06) : 1971 - 1986
  • [24] A survey of transcriptome complexity in Sus scrofa using single-molecule long-read sequencing
    Li, Yao
    Fang, Chengchi
    Fu, Yuhua
    Hu, An
    Li, Cencen
    Zou, Cheng
    Li, Xinyun
    Zhao, Shuhong
    Zhang, Chengjun
    Li, Changchun
    DNA RESEARCH, 2018, 25 (04) : 421 - 437
  • [25] A global survey of the transcriptome of allopolyploid Brassica napus based on single-molecule long-read isoform sequencing and Illumina-based RNA sequencing data
    Yao, Shengli
    Liang, Fan
    Gill, Rafaqat Ali
    Huang, Junyan
    Cheng, Xiaohui
    Liu, Yueying
    Tong, Chaobo
    Liu, Shengyi
    PLANT JOURNAL, 2020, 103 (02) : 843 - 857
  • [26] Combination analysis of single-molecule long-read and Illumina sequencing provides insights into the anthocyanin accumulation mechanism in an ornamental grass, Pennisetum setaceum cv. Rubrum
    Liu, Lingyun
    Teng, Ke
    Fan, Xifeng
    Han, Chao
    Zhang, Hui
    Wu, Juying
    Chang, Zhihui
    PLANT MOLECULAR BIOLOGY, 2022, 109 (1-2) : 159 - 175
  • [27] Detecting AGG Interruptions in Male and Female FMR1 Premutation Carriers by Single-Molecule Sequencing
    Ardui, Simon
    Race, Valerie
    Zablotskaya, Alena
    Hestand, Matthew S.
    Van Esch, Hilde
    Devriendt, Koenraad
    Matthijs, Gert
    Vermeesch, Joris R.
    HUMAN MUTATION, 2017, 38 (03) : 324 - 331
  • [28] Single-molecule long-read sequencing reveals a conserved intact long RNA profile in sperm
    Sun, Yu H.
    Wang, Anqi
    Song, Chi
    Shankar, Goutham
    Srivastava, Rajesh K.
    Au, Kin Fai
    Li, Xin Zhiguo
    NATURE COMMUNICATIONS, 2021, 12 (01)
  • [29] Single-molecule targeted accessibility and methylation sequencing of centromeres, telomeres and rDNAs in Arabidopsis
    Mo, Weipeng
    Shu, Yi
    Liu, Bo
    Long, Yanping
    Li, Tong
    Cao, Xiaofeng
    Deng, Xian
    Zhai, Jixian
    NATURE PLANTS, 2023, 9 (09) : 1439 - +
  • [30] Detection of an alcohol-associated cancer marker by single-molecule quantum sequencing
    Komoto, Yuki
    Ohshiro, Takahito
    Taniguchi, Masateru
    CHEMICAL COMMUNICATIONS, 2020, 56 (91) : 14299 - 14302