De novo assembly of the pepper transcriptome (Capsicum annuum): a benchmark for in silico discovery of SNPs, SSRs and candidate genes

Cited by: 94
Authors
Ashrafi, Hamid [1]
Hill, Theresa [1]
Stoffel, Kevin [1]
Kozik, Alexander [2]
Yao, JiQiang [1]
Chin-Wo, Sebastian Reyes [1, 2]
Van Deynze, Allen [1]
Affiliations
[1] Univ Calif Davis, Seed Biotechnol Ctr, Davis, CA 95616 USA
[2] Univ Calif Davis, Genome Ctr, Davis, CA 95616 USA
Source
BMC Genomics | 2012 / Vol. 13
Keywords
Pepper; Capsicum spp; Molecular Markers; EST; Transcriptome; RNAseq; Annotation; SNP; SSR; SPP; LINKAGE MAP; ANNOTATION; DIVERSITY; ALIGNMENT; TOOL;
DOI
10.1186/1471-2164-13-571
Chinese Library Classification (CLC) number
Q81 [Bioengineering (Biotechnology)]; Q93 [Microbiology];
Discipline classification code
071005 ; 0836 ; 090102 ; 100705 ;
Abstract
Background: Molecular breeding of pepper (Capsicum spp.) can be accelerated by developing DNA markers associated with transcriptomes in breeding germplasm. Before the advent of next-generation sequencing (NGS) technologies, the majority of sequencing data were generated by the Sanger sequencing method. By leveraging Sanger EST data, we have generated a wealth of genetic information for pepper, including thousands of SNPs and Single Position Polymorphic (SPP) markers. To complement and enhance these resources, we applied NGS to three pepper genotypes, Maor, Early Jalapeno and Criollo de Morelos-334 (CM334), to identify SNPs and SSRs in the assembly of these three genotypes.
Results: Two pepper transcriptome assemblies were developed with different purposes. The first reference sequence, assembled with CAP3 software, comprises 31,196 contigs from >125,000 Sanger-EST sequences that were mainly derived from a Korean F1 hybrid line, Bukang. Overlapping probes were designed for 30,815 unigenes to construct a pepper Affymetrix GeneChip® microarray for whole-genome analyses. In addition, custom Python scripts were used to identify 4,236 SNPs in contigs of the assembly. A total of 2,489 simple sequence repeats (SSRs) were identified from the assembly, and primers were designed for the SSRs. Annotation of contigs using Blast2GO software resulted in information for 60% of the unigenes in the assembly. The second transcriptome assembly was constructed from more than 200 million Illumina Genome Analyzer II reads (80-120 nt) using a combination of the Velvet, CLC workbench and CAP3 software packages. BWA, SAMtools and in-house Perl scripts were used to identify SNPs among the three pepper genotypes. The SNPs were filtered to be at least 50 bp from any intron-exon junction and from any flanking SNP. More than 22,000 high-quality putative SNPs were identified. Using the MISA software, 10,398 SSR markers were also identified within the Illumina transcriptome assembly, and primers were designed for the identified markers. The assembly was annotated with Blast2GO, and 14,740 (12%) of annotated contigs were associated with functional proteins.
Conclusions: Before the pepper genome sequence became available, assembling transcriptomes of this economically important crop was required to generate thousands of high-quality molecular markers that could be used in breeding programs. To better understand the assembled sequences and to identify candidate genes underlying QTLs, we annotated the contigs of the Sanger-EST and Illumina transcriptome assemblies. This and other information has been curated in a database dedicated to the pepper project.
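The abstract states that SNPs were kept only if they lay at least 50 bp from any intron-exon junction and from flanking SNPs. The authors' in-house Perl scripts are not reproduced in this record, so the following Python sketch is only an illustration of that kind of distance filter. The function name `filter_snps`, its parameters, and the assumptions that positions are integer coordinates on a single contig and that each SNP is compared against all candidate SNPs (not only those that survive filtering) are hypothetical.

```python
from bisect import bisect_left


def filter_snps(snp_positions, junction_positions, min_dist=50):
    """Keep SNPs at least min_dist bp from every junction and every other SNP.

    Hypothetical sketch: all positions are assumed to be integer coordinates
    on the same contig; this is not the authors' published pipeline.
    """
    snps = sorted(set(snp_positions))
    junctions = sorted(junction_positions)

    def near_junction(pos):
        # Only the closest junction on each side needs to be checked.
        i = bisect_left(junctions, pos)
        for j in (i - 1, i):
            if 0 <= j < len(junctions) and abs(junctions[j] - pos) < min_dist:
                return True
        return False

    kept = []
    for k, pos in enumerate(snps):
        # Compare against the nearest candidate SNP on each side.
        left_ok = k == 0 or pos - snps[k - 1] >= min_dist
        right_ok = k == len(snps) - 1 or snps[k + 1] - pos >= min_dist
        if left_ok and right_ok and not near_junction(pos):
            kept.append(pos)
    return kept
```

For example, `filter_snps([10, 100, 120, 300, 500], [320, 700])` drops 100 and 120 because they are 20 bp apart, and drops 300 because it is 20 bp from the junction at 320.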
Pages: 15
Related papers
50 in total
  • [1] De novo assembly of the pepper transcriptome (Capsicum annuum): a benchmark for in silico discovery of SNPs, SSRs and candidate genes
    Hamid Ashrafi
    Theresa Hill
    Kevin Stoffel
    Alexander Kozik
    JiQiang Yao
    Sebastian Reyes Chin-Wo
    Allen Van Deynze
    BMC Genomics, 13
  • [2] Sequencing, de novo assembly and characterization of the spotted scat Scatophagus argus (Linnaeus 1766) transcriptome for discovery of reproduction related genes and SSRs
    Yang Wei
    Chen Huapu
    Cui Xuefan
    Zhang Kewei
    Jiang Dongneng
    Deng Siping
    Zhu Chunhua
    Li Guangli
    JOURNAL OF OCEANOLOGY AND LIMNOLOGY, 2018, 36 (04) : 1329 - 1341
  • [3] Transcriptome profiling and molecular marker discovery in red pepper, Capsicum annuum L. TF68
    Lu, Fu-Hao
    Cho, Myeong-Cheoul
    Park, Yong-Jin
    MOLECULAR BIOLOGY REPORTS, 2012, 39 (03) : 3327 - 3335
  • [4] De novo transcriptome assembly and novel microsatellite marker information in Capsicum annuum varieties Saengryeg 211 and Saengryeg 213
    Ahn, Yul-Kyun
    Tripathi, Swati
    Cho, Young-Il
    Kim, Jeong-Ho
    Lee, Hye-Eun
    Kim, Do-Sun
    Woo, Jong-Gyu
    Cho, Myeong-Cheoul
    BOTANICAL STUDIES, 2013, 54
  • [5] De Novo Transcriptome Assembly of Isatis indigotica at Reproductive Stages and Identification of Candidate Genes Associated with Flowering Pathways
    Bai, Yu
    Zhou, Ying
    Tang, Xiaoqing
    Wang, Yu
    Wang, Fangquan
    Yang, Jie
    JOURNAL OF THE AMERICAN SOCIETY FOR HORTICULTURAL SCIENCE, 2018, 143 (01) : 56 - +
  • [6] Transcriptome profiling and molecular marker discovery in red pepper, Capsicum annuum L. TF68
    Fu-Hao Lu
    Myeong-Cheoul Cho
    Yong-Jin Park
    Molecular Biology Reports, 2012, 39 : 3327 - 3335
  • [7] SNP discovery in radiata pine using a de novo transcriptome assembly
    Duran, Ricardo
    Rodriguez, Victoria
    Carrasco, Angela
    Neale, David
    Balocchi, Claudio
    Valenzuela, Sofia
    TREES-STRUCTURE AND FUNCTION, 2019, 33 (05) : 1505 - 1511
  • [8] De novo assembly, transcriptome characterization and marker discovery in Indian major carp, Labeo rohita through pyrosequencing
    Sahoo, L.
    Das, S. P.
    Bit, A.
    Patnaik, S.
    Mohanty, M.
    Das, G.
    Das, P.
    GENETICA, 2022, 150 (01) : 59 - 66
  • [9] Evolutionary insights from de novo transcriptome assembly and SNP discovery in California white oaks
    Cokus, Shawn J.
    Gugger, Paul F.
    Sork, Victoria L.
    BMC GENOMICS, 2015, 16
  • [10] De novo assembly of Eugenia uniflora L. transcriptome and identification of genes from the terpenoid biosynthesis pathway
    Guzman, Frank
    Kulcheski, Franceli Rodrigues
    Turchetto-Zolet, Andreia Carina
    Margis, Rogerio
    PLANT SCIENCE, 2014, 229 : 238 - 246