A Low-Cost Library Construction Protocol and Data Analysis Pipeline for Illumina-Based Strand-Specific Multiplex RNA-Seq

Cited by: 100
Authors
Wang, Lin [1]
Si, Yaqing [2]
Dedow, Lauren K. [1]
Shao, Ying [3]
Liu, Peng [2]
Brutnell, Thomas P. [1]
Affiliations
[1] Cornell Univ, Boyce Thompson Inst Plant Res, Ithaca, NY 14853 USA
[2] Iowa State Univ, Dept Stat, Ames, IA USA
[3] Weill Cornell Med Coll, Dept Microbiol & Immunol, New York, NY USA
Source
PLOS ONE | 2011 / Volume 6 / Issue 10
Funding
National Science Foundation (USA)
Keywords
TRANSCRIPTOMES; EXPRESSION; ALIGNMENT
DOI
10.1371/journal.pone.0026426
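Chinese Library Classification (CLC) number
O [Mathematical Sciences and Chemistry]; P [Astronomy and Earth Sciences]; Q [Biological Sciences]; N [General Natural Sciences]
Discipline classification code
07; 0710; 09
Abstract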
The emergence of next-generation sequencing technology has generated much interest in the exploration of transcriptomes. Currently, Illumina Inc. (San Diego, CA) provides one of the most widely utilized sequencing platforms for gene expression analysis. While Illumina reagents and protocols perform adequately in RNA sequencing (RNA-seq), alternative reagents and protocols promise higher throughput at a much lower cost. We have developed a low-cost and robust protocol to produce Illumina-compatible (GAIIx and HiSeq2000 platforms) RNA-seq libraries by combining several recent improvements. First, we designed balanced adapter sequences for multiplexing of samples; second, dUTP incorporation in second-strand synthesis was used to enforce strand specificity; third, we simplified the RNA purification, fragmentation and library size-selection steps, thus drastically reducing the time and increasing the throughput of library construction; fourth, we included an RNA spike-in control for validation and normalization purposes. To streamline informatics analysis for the community, we established a pipeline within the iPlant Collaborative. These scripts are easily customized to meet specific research needs and improve on existing informatics and statistical treatments of RNA-seq data. In particular, we apply significance tests for determining differential gene expression and intron retention events. To demonstrate the potential of both the library-construction protocol and the data-analysis pipeline, we characterized the transcriptome of the rice leaf. Our data support novel gene models and can be used to improve current rice genome annotation. Additionally, using the rice transcriptome data, we compared different methods of calculating gene expression and discuss the advantages of a strand-specific approach for detecting bona fide antisense transcripts and intron retention events. Our results demonstrate the potential of this low-cost and robust method for RNA-seq library construction and data analysis.
Pages: 12