Single-molecule sequencing of the desiccation-tolerant grass Oropetium thomaeum

Cited by: 221
Authors
VanBuren, Robert [1]
Bryant, Doug [1]
Edger, Patrick P. [2,3]
Tang, Haibao [4,5]
Burgess, Diane [2]
Challabathula, Dinakar [6]
Spittle, Kristi [7]
Hall, Richard [7]
Gu, Jenny [7]
Lyons, Eric [4]
Freeling, Michael [2]
Bartels, Dorothea [6]
Ten Hallers, Boudewijn [8]
Hastie, Alex [8]
Michael, Todd P. [9]
Mockler, Todd C. [1]
Affiliations
[1] Donald Danforth Plant Sci Ctr, St Louis, MO 63132 USA
[2] Univ Calif Berkeley, Dept Plant & Microbial Biol, Berkeley, CA 94720 USA
[3] Michigan State Univ, Dept Hort, E Lansing, MI 48323 USA
[4] Univ Arizona, Sch Plant Sci, IPlant Collaborat, Tucson, AZ 85721 USA
[5] Fujian Agr & Forestry Univ, HIST, Ctr Genom & Biotechnol, Fuzhou 350002, Peoples R China
[6] Univ Bonn, IMBIO, D-53115 Bonn, Germany
[7] Pacific Biosci, Menlo Pk, CA 94025 USA
[8] BioNano Genom, San Diego, CA 92121 USA
[9] Ibis Biosci, Carlsbad, CA 92008 USA
Funding
National Science Foundation (USA);
Keywords
STRUCTURAL VARIATION; GENOME COMPARISONS; TANDEM REPEATS; DNA; REVEALS; GENE; SIZE; IDENTIFICATION; TRANSCRIPTOME; COMPLEXITY;
DOI
10.1038/nature15714
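Chinese Library Classification number
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences];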
Discipline classification code
07 ; 0710 ; 09 ;
Abstract
Plant genomes, and eukaryotic genomes in general, are typically repetitive, polyploid and heterozygous, which complicates genome assembly(1). The short read lengths of early Sanger and current next-generation sequencing platforms hinder assembly through complex repeat regions, and many draft and reference genomes are fragmented, lacking skewed GC and repetitive intergenic sequences, which are gaining importance due to projects like the Encyclopedia of DNA Elements (ENCODE)(2). Here we report the whole-genome sequencing and assembly of the desiccation-tolerant grass Oropetium thomaeum. Using only single-molecule real-time sequencing, which generates long (>16 kilobases) reads with random errors, we assembled 99% (244 megabases) of the Oropetium genome into 625 contigs with an N50 length of 2.4 megabases. Oropetium is an example of a 'near-complete' draft genome which includes gapless coverage over gene space as well as intergenic sequences such as centromeres, telomeres, transposable elements and rRNA clusters that are typically unassembled in draft genomes. Oropetium has 28,466 protein-coding genes and 43% repeat sequences, yet with 30% more compact euchromatic regions it is the smallest known grass genome. The Oropetium genome demonstrates the utility of single-molecule real-time sequencing for assembling high-quality plant and other eukaryotic genomes, and serves as a valuable resource for the plant comparative genomics community.
Pages: 508 / U209
Page count: 16
Related papers
50 in total
  • [41] Characterization of the Rosellinia necatrix Transcriptome and Genes Related to Pathogenesis by Single-Molecule mRNA Sequencing
    Kim, Hyeongmin
    Lee, Seung Jae
    Jo, Ick-Hyun
    Lee, Jinsu
    Bae, Wonsil
    Kim, Hyemin
    Won, Kyungho
    Hyun, Tae Kyung
    Ryu, Hojin
    PLANT PATHOLOGY JOURNAL, 2017, 33 (04) : 362 - 369
  • [42] Comprehensive analysis of structural variants in breast cancer genomes using single-molecule sequencing
    Aganezov, Sergey
    Goodwin, Sara
    Sherman, Rachel M.
    Sedlazeck, Fritz J.
    Arun, Gayatri
    Bhatia, Sonam
    Lee, Isac
    Kirsche, Melanie
    Wappel, Robert
    Kramer, Melissa
    Kostroff, Karen
    Spector, David L.
    Timp, Winston
    McCombie, W. Richard
    Schatz, Michael C.
    GENOME RESEARCH, 2020, 30 (09) : 1258 - 1273
  • [43] Modeling single-molecule stochastic transport for DNA exo-sequencing in nanopore sensors
    Stadlbauer, Benjamin
    Mitscha-Baude, Gregor
    Heitzinger, Clemens
    NANOTECHNOLOGY, 2020, 31 (07)
  • [44] Massively parallel analysis of single-molecule dynamics on next-generation sequencing chips
    Rivera, J. Aguirre
    Mao, G.
    Sabantsev, A.
    Panfilov, M.
    Hou, Q.
    Lindell, M.
    Chanez, C.
    Ritort, F.
    Jinek, M.
    Deindl, S.
    SCIENCE, 2024, 385 (6711) : 892 - 898
  • [45] A transcriptome atlas of rabbit revealed by PacBio single-molecule long-read sequencing
    Chen, Shi-Yi
    Deng, Feilong
    Jia, Xianbo
    Li, Cao
    Lai, Song-Jia
    SCIENTIFIC REPORTS, 2017, 7
  • [46] Construction of JRG (Japanese reference genome) with single-molecule real-time sequencing
    Nagasaki, Masao
    Kuroki, Yoko
    Shibata, Tomoko F.
    Katsuoka, Fumiki
    Mimori, Takahiro
    Kawai, Yosuke
    Minegishi, Naoko
    Hozawa, Atsushi
    Kuriyama, Shinichi
    Suzuki, Yoichi
    Kawame, Hiroshi
    Nagami, Fuji
    Takai-Igarashi, Takako
    Ogishima, Soichi
    Kojima, Kaname
    Misawa, Kazuharu
    Tanabe, Osamu
    Fuse, Nobuo
    Tanaka, Hiroshi
    Yaegashi, Nobuo
    Kinoshita, Kengo
    Kure, Shiego
    Yasuda, Jun
    Yamamoto, Masayuki
    HUMAN GENOME VARIATION, 2019, 6 (1)
  • [47] Single-molecule DNA-mapping and whole-genome sequencing of individual cells
    Marie, Rodolphe
    Pedersen, Jonas N.
    Baerlocher, Loic
    Koprowska, Kamila
    Podenphant, Marie
    Sabatel, Celine
    Zalkovskij, Maksim
    Mironov, Andrej
    Bilenberg, Brian
    Ashley, Neil
    Flyvbjerg, Henrik
    Bodmer, Walter F.
    Kristensen, Anders
    Mir, Kalim U.
    PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 2018, 115 (44) : 11192 - 11197
  • [48] Detecting PKD1 variants in polycystic kidney disease patients by single-molecule long-read sequencing
    Borras, Daniel M.
    Vossen, Rolf H. A. M.
    Liem, Michael
    Buermans, Henk P. J.
    Dauwerse, Hans
    van Heusden, Dave
    Gansevoort, Ron T.
    den Dunnen, Johan T.
    Janssen, Bart
    Peters, Dorien J. M.
    Losekoot, Monique
    Anvar, Seyed Yahya
    HUMAN MUTATION, 2017, 38 (07) : 870 - 879
  • [49] Identification of genomic insertion and flanking sequences of the transgenic drought-tolerant maize line "SbSNAC1-382" using the single-molecule real-time (SMRT) sequencing method
    Zeng, Tingru
    Zhang, Dengfeng
    Li, Yongxiang
    Li, Chunhui
    Liu, Xuyang
    Shi, Yunsu
    Song, Yanchun
    Li, Yu
    Wang, Tianyu
    PLOS ONE, 2020, 15 (04)
  • [50] Identification of alternatively spliced gene isoforms and novel noncoding RNAs by single-molecule long-read sequencing in Camellia
    Hu, Zhikang
    Lyu, Tao
    Yan, Chao
    Wang, Yupeng
    Ye, Ning
    Fan, Zhengqi
    Li, Xinlei
    Li, Jiyuan
    Yin, Hengfu
    RNA BIOLOGY, 2020, 17 (07) : 966 - 976