rnaSPAdes: a de novo transcriptome assembler and its application to RNA-Seq data

Cited by: 466
Authors
Bushmanova, Elena [1]
Antipov, Dmitry [1]
Lapidus, Alla [1]
Prjibelski, Andrey D. [1]
Affiliations
[1] St Petersburg State Univ, Inst Translat Biomed, Ctr Algorithm Biotechnol, 6 Linia VO 11d, St Petersburg 199004, Russia
Funding
Russian Foundation for Basic Research
Keywords
RNA-Seq; de novo assembly; transcriptome assembly; quality assessment
DOI
10.1093/gigascience/giz100
Chinese Library Classification (CLC) number
Q [Biological Sciences]
Discipline classification code
07; 0710; 09
Abstract
Background: The ability to generate large RNA-sequencing datasets has led to the development of various reference-based and de novo transcriptome assemblers, each with its own strengths and limitations. While reference-based tools are widely used in transcriptomic studies, their application is limited to organisms with finished and well-annotated genomes. De novo transcriptome reconstruction from short reads remains an open and challenging problem, complicated by varying expression levels across genes, alternative splicing, and paralogous genes.
Results: Herein we describe the novel transcriptome assembler rnaSPAdes, which has been developed on top of the SPAdes genome assembler and explores computational parallels between the assembly of transcriptomes and single-cell genomes. We also present quality assessment reports for rnaSPAdes assemblies, compare rnaSPAdes with modern transcriptome assembly tools using several evaluation approaches on various RNA-sequencing datasets, and briefly highlight the strong and weak points of the different assemblers.
Conclusions: Based on the comparison of the different assembly methods, we infer that no single assembler is the absolute leader across all quality metrics and all datasets used. However, rnaSPAdes typically outperforms other assemblers on such important properties as the number of assembled genes and isoforms, while on average achieving higher accuracy statistics than its closest competitors.
Pages: 13