De novo assembly of the pepper transcriptome (Capsicum annuum): a benchmark for in silico discovery of SNPs, SSRs and candidate genes

Cited by: 93
Authors
Ashrafi, Hamid [1]
Hill, Theresa [1]
Stoffel, Kevin [1]
Kozik, Alexander [2]
Yao, JiQiang [1]
Chin-Wo, Sebastian Reyes [1,2]
Van Deynze, Allen [1]
Affiliations
[1] Univ Calif Davis, Seed Biotechnol Ctr, Davis, CA 95616 USA
[2] Univ Calif Davis, Genome Ctr, Davis, CA 95616 USA
Source
BMC GENOMICS | 2012 / Volume 13
Keywords
Pepper; Capsicum spp; Molecular Markers; EST; Transcriptome; RNAseq; Annotation; SNP; SSR; SPP; LINKAGE MAP; ANNOTATION; DIVERSITY; ALIGNMENT; TOOL;
DOI
10.1186/1471-2164-13-571
Chinese Library Classification (CLC) number
Q81 [Bioengineering (Biotechnology)]; Q93 [Microbiology];
Discipline classification code
071005 ; 0836 ; 090102 ; 100705 ;
Abstract
Background: Molecular breeding of pepper (Capsicum spp.) can be accelerated by developing DNA markers associated with transcriptomes in breeding germplasm. Before the advent of next generation sequencing (NGS) technologies, the majority of sequencing data were generated by the Sanger sequencing method. By leveraging Sanger EST data, we have generated a wealth of genetic information for pepper including thousands of SNPs and Single Position Polymorphic (SPP) markers. To complement and enhance these resources, we applied NGS to three pepper genotypes: Maor, Early Jalapeno and Criollo de Morelos-334 (CM334) to identify SNPs and SSRs in the assembly of these three genotypes.
Results: Two pepper transcriptome assemblies were developed with different purposes. The first reference sequence, assembled by CAP3 software, comprises 31,196 contigs from >125,000 Sanger-EST sequences that were mainly derived from a Korean F1-hybrid line, Bukang. Overlapping probes were designed for 30,815 unigenes to construct a pepper Affymetrix GeneChip® microarray for whole genome analyses. In addition, custom Python scripts were used to identify 4,236 SNPs in contigs of the assembly. A total of 2,489 simple sequence repeats (SSRs) were identified from the assembly, and primers were designed for the SSRs. Annotation of contigs using Blast2GO software resulted in information for 60% of the unigenes in the assembly. The second transcriptome assembly was constructed from more than 200 million Illumina Genome Analyzer II reads (80-120 nt) using a combination of Velvet, CLC workbench and CAP3 software packages. BWA, SAMtools and in-house Perl scripts were used to identify SNPs among three pepper genotypes. The SNPs were filtered to be at least 50 bp from any intron-exon junctions as well as flanking SNPs. More than 22,000 high-quality putative SNPs were identified. Using the MISA software, 10,398 SSR markers were also identified within the Illumina transcriptome assembly and primers were designed for the identified markers. The assembly was annotated by Blast2GO and 14,740 (12%) of annotated contigs were associated with functional proteins.
Conclusions: Before availability of pepper genome sequence, assembling transcriptomes of this economically important crop was required to generate thousands of high-quality molecular markers that could be used in breeding programs. In order to have a better understanding of the assembled sequences and to identify candidate genes underlying QTLs, we annotated the contigs of Sanger-EST and Illumina transcriptome assemblies. These and other information have been curated in a database that we have dedicated for pepper project.
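Illustrative note (not part of the original record): the abstract states that SNPs were kept only if they lay at least 50 bp from any intron-exon junction and from flanking SNPs. The authors used in-house Perl scripts that are not shown here, so the following is only a minimal Python sketch of such a spacing filter. The function name `filter_snps`, its parameters, and the choice to measure distance against all candidate SNPs on a single contig are assumptions made for illustration.

```python
from bisect import bisect_left


def filter_snps(snp_positions, junction_positions, min_dist=50):
    """Hypothetical spacing filter for SNP positions on one contig.

    Keeps a SNP only if it is at least `min_dist` bp away from every
    other candidate SNP and from every intron-exon junction.
    """
    snps = sorted(set(snp_positions))
    junctions = sorted(junction_positions)
    kept = []
    for i, pos in enumerate(snps):
        # Reject if the previous or next candidate SNP is too close.
        if i > 0 and pos - snps[i - 1] < min_dist:
            continue
        if i + 1 < len(snps) and snps[i + 1] - pos < min_dist:
            continue
        # Locate the nearest junctions on either side by binary search.
        j = bisect_left(junctions, pos)
        if j < len(junctions) and junctions[j] - pos < min_dist:
            continue
        if j > 0 and pos - junctions[j - 1] < min_dist:
            continue
        kept.append(pos)
    return kept


if __name__ == "__main__":
    # Made-up coordinates, used only to exercise the filter.
    print(filter_snps([100, 120, 300, 500, 1000], [520, 800]))
```

In this sketch a SNP exactly 50 bp from its neighbour is retained, matching the "at least 50 bp" wording; whether the original pipeline treated the boundary the same way is not stated in the abstract.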
Pages: 15
Related papers
50 in total
  • [1] De novo assembly of the pepper transcriptome (Capsicum annuum): a benchmark for in silico discovery of SNPs, SSRs and candidate genes
    Hamid Ashrafi
    Theresa Hill
    Kevin Stoffel
    Alexander Kozik
    JiQiang Yao
    Sebastian Reyes Chin-Wo
    Allen Van Deynze
    BMC Genomics, 13
  • [2] De Novo Transcriptome Assembly in Chili Pepper (Capsicum frutescens) to Identify Genes Involved in the Biosynthesis of Capsaicinoids
    Liu, Shaoqun
    Li, Wanshun
    Wu, Yimin
    Chen, Changming
    Lei, Jianjun
    PLOS ONE, 2013, 8 (01):
  • [3] Pepper EST database: comprehensive in silico tool for analyzing the chili pepper (Capsicum annuum) transcriptome
    Hyun-Jin Kim
    Kwang-Hyun Baek
    Seung-Won Lee
    JungEun Kim
    Bong-Woo Lee
    Hye-Sun Cho
    Woo Taek Kim
    Doil Choi
    Cheol-Goo Hur
    BMC Plant Biology, 8
  • [4] Pepper EST database: comprehensive in silico tool for analyzing the chili pepper (Capsicum annuum) transcriptome
    Kim, Hyun-Jin
    Baek, Kwang-Hyun
    Lee, Seung-Won
    Kim, JungEun
    Lee, Bong-Woo
    Cho, Hye-Sun
    Kim, Woo Taek
    Choi, Doil
    Hur, Cheol-Goo
    BMC PLANT BIOLOGY, 2008, 8 (1)
  • [5] Sequencing, de novo assembly and characterization of the spotted scat Scatophagus argus (Linnaeus 1766) transcriptome for discovery of reproduction related genes and SSRs
    杨尉
    陈华谱
    崔雪凡
    张克伟
    江东能
    邓思平
    朱春华
    李广丽
    Journal of Oceanology and Limnology, 2018, 36 (04) : 1329 - 1341
  • [6] Sequencing, de novo assembly and characterization of the spotted scat Scatophagus argus (Linnaeus 1766) transcriptome for discovery of reproduction related genes and SSRs
    Wei Yang
    Huapu Chen
    Xuefan Cui
    Kewei Zhang
    Dongneng Jiang
    Siping Deng
    Chunhua Zhu
    Guangli Li
    Journal of Oceanology and Limnology, 2018, 36 : 1329 - 1341
  • [7] Sequencing, de novo assembly and characterization of the spotted scat Scatophagus argus (Linnaeus 1766) transcriptome for discovery of reproduction related genes and SSRs
    Yang Wei
    Chen Huapu
    Cui Xuefan
    Zhang Kewei
    Jiang Dongneng
    Deng Siping
    Zhu Chunhua
    Li Guangli
    JOURNAL OF OCEANOLOGY AND LIMNOLOGY, 2018, 36 (04) : 1329 - 1341
  • [8] De novo transcriptome assembly and novel microsatellite marker information in Capsicum annuum varieties Saengryeg 211 and Saengryeg 213
    Ahn, Yul-Kyun
    Tripathi, Swati
    Cho, Young-Il
    Kim, Jeong-Ho
    Lee, Hye-Eun
    Kim, Do-Sun
    Woo, Jong-Gyu
    Cho, Myeong-Cheoul
    BOTANICAL STUDIES, 2013, 54
  • [9] De novo transcriptome assembly and novel microsatellite marker information in Capsicum annuum varieties Saengryeg 211 and Saengryeg 213
    Yul-Kyun Ahn
    Swati Tripathi
    Young-Il Cho
    Jeong-Ho Kim
    Hye-Eun Lee
    Do-Sun Kim
    Jong-Gyu Woo
    Myeong-Cheoul Cho
    Botanical Studies, 54
  • [10] De novo transcriptome assembly of the cotyledon of Camellia oleifera for discovery of genes regulating seed germination
    Wei Long
    Xiaohua Yao
    Kailiang Wang
    Yu Sheng
    Leyan Lv
    BMC Plant Biology, 22