SL-quant: a fast and flexible pipeline to quantify spliced leader trans-splicing events from RNA-seq data

Cited by: 10
Authors
Yague-Sanz, Carlo [1]
Hermand, Damien [1]
Affiliations
[1] Univ Namur UNamur, URPhyM GEMO, 61 Rue Bruxelles, B-5000 Namur, Belgium
Source
GIGASCIENCE | 2018 / Volume 7 / Issue 07
Keywords
NGS; RNA-seq; maturation; trans-splicing; sequence analysis; GENOME;
DOI
10.1093/gigascience/giy084
Chinese Library Classification number
Q [Biological Sciences];
Subject classification codes
07 ; 0710 ; 09 ;
Abstract
Background: The spliceosomal transfer of a short spliced leader (SL) RNA to an independent pre-mRNA molecule is called SL trans-splicing and is widespread in the nematode Caenorhabditis elegans. While RNA-sequencing (RNA-seq) data contain information on such events, properly documented methods to extract them are lacking.

Findings: To address this, we developed SL-quant, a fast and flexible pipeline that adapts to paired-end and single-end RNA-seq data and accurately quantifies SL trans-splicing events. It is designed to work downstream of read mapping and uses the reads left unmapped as primary input. Briefly, the SL sequences are identified with high specificity and are trimmed from the input reads, which are then remapped on the reference genome and quantified at the nucleotide position level (SL trans-splice sites) or at the gene level.

Conclusions: SL-quant completes within 10 minutes on a basic desktop computer for typical C. elegans RNA-seq datasets and can be applied to other species as well. Validating the method, the SL trans-splice sites identified display the expected consensus sequence, and the results of the gene-level quantification are predictive of the gene position within operons. We also compared SL-quant to a recently published SL-containing read identification strategy that was found to be more sensitive but less specific than SL-quant. Both methods are implemented as a bash script available under the MIT license [1]. Full instructions for its installation, usage, and adaptation to other organisms are provided.
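The identify-and-trim step described in the abstract can be illustrated with a toy sketch. This is not the SL-quant implementation: it assumes exact string matching, a hypothetical minimum overlap of 8 nt, and an illustrative leader sequence (taken here to be C. elegans SL1, which should be checked against a reference before any real use). The function names `trim_sl` and `summarize` are invented for this example.

```python
from collections import Counter

# Illustrative leader sequence (assumed to be C. elegans SL1; verify before real use).
SL1 = "GGTTTAATTACCCAAGTTTGAG"


def trim_sl(read, sl=SL1, min_overlap=8):
    """Trim a spliced-leader tail from the 5' end of a read.

    Returns (trimmed_read, overlap) when the read starts with a suffix of
    `sl` that is at least `min_overlap` nt long, otherwise (read, 0).
    Only exact matches are considered (an assumption of this sketch).
    """
    # Try the longest possible SL suffix first, then shorter ones.
    for k in range(min(len(sl), len(read)), min_overlap - 1, -1):
        if read.startswith(sl[-k:]):
            return read[k:], k
    return read, 0


def summarize(reads, sl=SL1, min_overlap=8):
    """Trim every read; return the trimmed SL-containing reads and a tally."""
    trimmed = []
    counts = Counter()
    for read in reads:
        rest, k = trim_sl(read, sl, min_overlap)
        counts["sl" if k else "no_sl"] += 1
        if k:
            trimmed.append(rest)  # these would be remapped to the genome
    return trimmed, counts


if __name__ == "__main__":
    unmapped = ["CCCAAGTTTGAGATGGCTAAAC", "ATGGCTAAACGGT"]
    kept, tally = summarize(unmapped)
    print(kept, dict(tally))
```

In a real workflow the trimmed reads would then be remapped to the reference genome and counted per position or per gene; the overlap threshold controls the trade-off between sensitivity and specificity mentioned in the abstract.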
Pages: 7
Related papers
35 in total
[1]   Sequencing of first-strand cDNA library reveals full-length transcriptomes [J].
Agarwal, Saurabh ;
Macfarlan, Todd S. ;
Sartor, Maureen A. ;
Iwase, Shigeki .
NATURE COMMUNICATIONS, 2015, 6
[2]   A global analysis of C. elegans trans-splicing [J].
Allen, Mary Ann ;
Hillier, LaDeana W. ;
Waterston, Robert H. ;
Blumenthal, Thomas .
GENOME RESEARCH, 2011, 21 (02) :255-264
[3]  
[Anonymous], 2017, R LANG ENV STAT COMP
[4]   Coupling mRNA processing with transcription in time and space [J].
Bentley, David L. .
NATURE REVIEWS GENETICS, 2014, 15 (03) :163-175
[5]  
Bitar Maina, 2013, Frontiers in Genetics, V4, P199, DOI 10.3389/fgene.2013.00199
[6]   A global analysis of Caenorhabditis elegans operons [J].
Blumenthal, T ;
Evans, D ;
Link, CD ;
Guffanti, A ;
Lawson, D ;
Thierry-Mieg, J ;
Thierry-Mieg, D ;
Chiu, WL ;
Duke, K ;
Kiraly, M ;
Kim, SK .
NATURE, 2002, 417 (6891) :851-854
[7]  
Blumenthal Thomas, 2012, WormBook, P1, DOI 10.1895/wormbook.1.5.2
[8]   The time-resolved transcriptome of C. elegans [J].
Boeck, Max E. ;
Huynh, Chau ;
Gevirtzman, Lou ;
Thompson, Owen A. ;
Wang, Guilin ;
Kasper, Dionna M. ;
Reinke, Valerie ;
Hillier, LaDeana W. ;
Waterston, Robert H. .
GENOME RESEARCH, 2016, 26 (10) :1441-1450
[9]   Trimmomatic: a flexible trimmer for Illumina sequence data [J].
Bolger, Anthony M. ;
Lohse, Marc ;
Usadel, Bjoern .
BIOINFORMATICS, 2014, 30 (15) :2114-2120
[10]   BLAST+: architecture and applications [J].
Camacho, Christiam ;
Coulouris, George ;
Avagyan, Vahram ;
Ma, Ning ;
Papadopoulos, Jason ;
Bealer, Kevin ;
Madden, Thomas L. .
BMC BIOINFORMATICS, 2009, 10