A high-quality cucumber genome assembly enhances computational comparative genomics

Cited by: 37
Authors
Osipowski, Pawel [1]
Pawelkowicz, Magdalena [1]
Wojcieszek, Michal [1]
Skarzynska, Agnieszka [1]
Przybecki, Zbigniew [1]
Plader, Wojciech [1]
Affiliations
[1] Warsaw Univ Life Sci SGGW, Inst Biol, Dept Plant Genet Breeding & Biotechnol, 159 Nowoursynowska St, Warsaw, Poland
Keywords
Genome assembly; Variant calling; Comparative genomics; Polymorphism detection; Cucumber; Cucumis sativus L.; STRUCTURAL VARIATION; READ ALIGNMENT; GENERATION; ANNOTATION; SEQUENCE; VARIANT; MUTATIONS; NUMBER; ASSOCIATION; INSERTIONS
DOI
10.1007/s00438-019-01614-3
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification code
071010; 081704
Abstract
Genetic variation is expressed by the presence of polymorphisms in compared genomes of individuals that can be transferred to next generations. The aim of this work was to reveal genome dynamics by predicting polymorphisms among the genomes of three individuals of the highly inbred B10 cucumber (Cucumis sativus L.) line. In this study, bioinformatic comparative genomics was used to uncover cucumber genome dynamics (also called real-time evolution). We obtained a new genome draft assembly from long single molecule real-time (SMRT) sequencing reads and used short paired-end read data from three individuals to analyse the polymorphisms. Using this approach, we uncovered differentiation aspects in the genomes of the inbred B10 line. The newly assembled genome sequence (B10v3) has the highest contiguity and quality characteristics among the currently available cucumber genome draft sequences. Standard and newly designed approaches were used to predict single nucleotide and structural variants that were unique among the three individual genomes. Some of the variant predictions spanned protein-coding genes and their promoters, and some were in the neighbourhood of annotated interspersed repetitive elements, indicating that the highly inbred homozygous plants remained genetically dynamic. This is the first bioinformatic comparative genomics study of a single highly inbred plant line. For this project, we developed a polymorphism prediction method with optimized precision parameters, which allowed the effective detection of small nucleotide variants (SNVs). This methodology could significantly improve bioinformatic pipelines for comparative genomics and thus has great practical potential in genomic metadata handling.
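The idea of predicting variants that are unique among the three individual genomes can be illustrated with a minimal sketch. This is not the authors' pipeline: the sample names, the variant key (chromosome, position, reference allele, alternate allele) and the example calls are all hypothetical, and the sketch assumes that variant calls have already been produced and normalised for each individual.

```python
from typing import Dict, Set, Tuple

# Assumed variant key: (chromosome, 1-based position, ref allele, alt allele).
Variant = Tuple[str, int, str, str]


def private_variants(calls: Dict[str, Set[Variant]]) -> Dict[str, Set[Variant]]:
    """Return, for each sample, the variants called in that sample only."""
    private: Dict[str, Set[Variant]] = {}
    for sample, variants in calls.items():
        # Union of the calls from every other sample.
        others = set().union(*(v for s, v in calls.items() if s != sample))
        private[sample] = variants - others
    return private


# Hypothetical call sets for three individuals (illustrative values only).
calls = {
    "plant_1": {("chr1", 100, "A", "G"), ("chr2", 50, "C", "T")},
    "plant_2": {("chr1", 100, "A", "G")},
    "plant_3": {("chr1", 100, "A", "G"), ("chr3", 7, "G", "GA")},
}
unique = private_variants(calls)
```

A plain set difference treats every call as equally reliable; a real workflow would normally filter calls by quality and read depth first, since a variant missed in one sample would otherwise appear as unique to the others.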
Pages: 177-193
Page count: 17