SpliceJumper: a classification-based approach for calling splicing junctions from RNA-seq data

Cited by: 2
Authors
Chu, Chong [1]
Li, Xin [1]
Wu, Yufeng [1]
Affiliations
[1] Univ Connecticut, Dept Comp Sci & Engn, Storrs, CT 06269 USA
Source
BMC BIOINFORMATICS | 2015 / Vol. 16
Funding
U.S. National Science Foundation;
Keywords
READ ALIGNMENT; ALGORITHM; ACCURATE;
DOI
10.1186/1471-2105-16-S17-S10
Chinese Library Classification number
Q5 [Biochemistry];
Discipline classification code
071010 ; 081704 ;
Abstract
Background: Next-generation RNA sequencing technologies have been widely applied in transcriptome profiling, facilitating studies of gene structure and expression on a genome-wide scale. Aligning reads to the reference genome and calling splicing junctions is an important step for downstream analyses such as alternative splicing analysis and isoform construction. However, because of introns, RNA-seq reads that span splicing sites cannot be fully mapped when aligned to the reference genome, so it is challenging to align reads and call splicing junctions accurately. Results: In this paper, we present a classification-based approach for calling splicing junctions from RNA-seq data, implemented in the program SpliceJumper. SpliceJumper uses a machine learning approach that combines multiple features extracted from RNA-seq data. We compare SpliceJumper with two existing RNA-seq analysis approaches, TopHat2 and MapSplice2, on both simulated and real data. Our results show that SpliceJumper outperforms TopHat2 and MapSplice2 in accuracy. The program SpliceJumper can be downloaded at https://github.com/Reedwarbler/SpliceJumper.
Pages: 11
Related papers
22 in total
[1]   Mechanisms of alternative pre-messenger RNA splicing [J].
Black, DL .
ANNUAL REVIEW OF BIOCHEMISTRY, 2003, 72 :291-336
[2]   PASS: a program to align short sequences [J].
Campagna, Davide ;
Albiero, Alessandro ;
Bilardi, Alessandra ;
Caniato, Elisa ;
Forcato, Claudio ;
Manavski, Svetlin ;
Vitulo, Nicola ;
Valle, Giorgio .
BIOINFORMATICS, 2009, 25 (07) :967-968
[3]   LIBSVM: A Library for Support Vector Machines [J].
Chang, Chih-Chung ;
Lin, Chih-Jen .
ACM TRANSACTIONS ON INTELLIGENT SYSTEMS AND TECHNOLOGY, 2011, 2 (03)
[4]   Personal Omics Profiling Reveals Dynamic Molecular and Medical Phenotypes [J].
Chen, Rui ;
Mias, George I. ;
Li-Pook-Than, Jennifer ;
Jiang, Lihua ;
Lam, Hugo Y. K. ;
Chen, Rong ;
Miriami, Elana ;
Karczewski, Konrad J. ;
Hariharan, Manoj ;
Dewey, Frederick E. ;
Cheng, Yong ;
Clark, Michael J. ;
Im, Hogune ;
Habegger, Lukas ;
Balasubramanian, Suganthi ;
O'Huallachain, Maeve ;
Dudley, Joel T. ;
Hillenmeyer, Sara ;
Haraksingh, Rajini ;
Sharon, Donald ;
Euskirchen, Ghia ;
Lacroute, Phil ;
Bettinger, Keith ;
Boyle, Alan P. ;
Kasowski, Maya ;
Grubert, Fabian ;
Seki, Scott ;
Garcia, Marco ;
Whirl-Carrillo, Michelle ;
Gallardo, Mercedes ;
Blasco, Maria A. ;
Greenberg, Peter L. ;
Snyder, Phyllis ;
Klein, Teri E. ;
Altman, Russ B. ;
Butte, Atul J. ;
Ashley, Euan A. ;
Gerstein, Mark ;
Nadeau, Kari C. ;
Tang, Hua ;
Snyder, Michael .
CELL, 2012, 148 (06) :1293-1307
[5]  
Chu C, 2014, PLOS ONE, V9
[6]   STAR: ultrafast universal RNA-seq aligner [J].
Dobin, Alexander ;
Davis, Carrie A. ;
Schlesinger, Felix ;
Drenkow, Jorg ;
Zaleski, Chris ;
Jha, Sonali ;
Batut, Philippe ;
Chaisson, Mark ;
Gingeras, Thomas R. .
BIOINFORMATICS, 2013, 29 (01) :15-21
[7]  
Engström PG, 2013, NAT METHODS, V10, P1185, DOI 10.1038/nmeth.2722
[8]   Comparative analysis of RNA-Seq alignment algorithms and the RNA-Seq unified mapper (RUM) [J].
Grant, Gregory R. ;
Farkas, Michael H. ;
Pizarro, Angel D. ;
Lahens, Nicholas F. ;
Schug, Jonathan ;
Brunk, Brian P. ;
Stoeckert, Christian J. ;
Hogenesch, John B. ;
Pierce, Eric A. .
BIOINFORMATICS, 2011, 27 (18) :2518-2528
[9]  
Jean G., 2010, CURR PROTOC BIOINFOR, V32, P11
[10]   TopHat2: accurate alignment of transcriptomes in the presence of insertions, deletions and gene fusions [J].
Kim, Daehwan ;
Pertea, Geo ;
Trapnell, Cole ;
Pimentel, Harold ;
Kelley, Ryan ;
Salzberg, Steven L. .
GENOME BIOLOGY, 2013, 14 (04)