Comprehensive transcriptome analysis of the highly complex Pisum sativum genome using next generation sequencing

Cited by: 110
Authors
Franssen, Susanne U. [2]
Shrestha, Roshan P. [1]
Braeutigam, Andrea [1, 3]
Bornberg-Bauer, Erich [2]
Weber, Andreas P. M. [1, 3]
Affiliations
[1] Michigan State Univ, Dept Plant Biol, E Lansing, MI 48823 USA
[2] Univ Munster, Inst Evolut & Biodivers, D-48149 Munster, Germany
[3] Univ Dusseldorf, Inst Plant Biochem, D-40225 Dusseldorf, Germany
Source
BMC GENOMICS | 2011 / Vol. 12
Keywords
SINGLE-COPY GENE; C-4; PHOTOSYNTHESIS; LIGHT CONTROL; PEA; EXPRESSION; TOOL; METABOLISM; PROTEOMICS; PROTEINS; ENVELOPE
DOI
10.1186/1471-2164-12-227
Chinese Library Classification
Q81 [Bioengineering (Biotechnology)]; Q93 [Microbiology]
Subject classification codes
071005; 0836; 090102; 100705
Abstract
Background: The garden pea, Pisum sativum, is among the best-investigated legume plants and is of significant agro-commercial relevance. Pisum sativum has a large and complex genome, and accordingly few comprehensive genomic resources exist.
Results: We analyzed the pea transcriptome at the highest accuracy achievable with current technology. We used next generation sequencing with the Roche/454 platform and evaluated and compared a variety of approaches, including diverse tissue libraries, normalization, alternative sequencing technologies, saturation estimation and diverse assembly strategies. We generated libraries from flowers, leaves, cotyledons, epi- and hypocotyl, and etiolated and light-treated etiolated seedlings, comprising a total of 450 megabases. Libraries were assembled into 324,428 unigenes in a first-pass assembly. A second-pass assembly reduced the number to 81,449 unigenes but introduced a significant number of chimeras. Analyses of the assemblies identified the assembly step as a major opportunity for improvement. By recording the frequencies of Arabidopsis orthologs hit by randomly drawn reads and fitting the parameters of the saturation curve, we concluded that sequencing was exhaustive. For leaf libraries, we found that normalization allows partial recovery of expression strength in addition to the desired effect of increased coverage. Based on theoretical and biological considerations, we concluded that the sequence reads in the database tagged the vast majority of transcripts in the aerial tissues. A pathway representation analysis showed the merits of sampling multiple aerial tissues to increase the number of tagged genes. All results have been made available as a fully annotated database in FASTA format.
Conclusions: We conclude that the approach taken resulted in a high-quality dataset which serves well as a first comprehensive reference set for the model legume pea. We suggest that future deep sequencing transcriptome projects of species lacking a genomics backbone will need to concentrate mainly on resolving the issues of redundancy and paralogy during transcriptome assembly.
Pages: 16