SOAPdenovo-Trans: de novo transcriptome assembly with short RNA-Seq reads

Cited by: 637
Authors
Xie, Yinlong [1, 2, 3, 4]
Wu, Gengxiong [2]
Tang, Jingbo [2, 5]
Luo, Ruibang [2, 3, 4, 6]
Patterson, Jordan [7]
Liu, Shanlin [2]
Huang, Weihua [2]
He, Guangzhu [2]
Gu, Shengchang [2, 8]
Li, Shengkang [2]
Zhou, Xin [2]
Lam, Tak-Wah [3, 4]
Li, Yingrui
Xu, Xun [2]
Wong, Gane Ka-Shu [2, 7, 9]
Wang, Jun [2, 10, 11, 12]
Affiliations
[1] S China Univ Technol, Sch Biosci & Bioengn, Guangzhou 510006, Guangdong, Peoples R China
[2] BGI Shenzhen, Shenzhen 518083, Peoples R China
[3] Univ Hong Kong, HKU BGI Bioinformat Algorithms & Core Technol Res, Hong Kong, Hong Kong, Peoples R China
[4] Univ Hong Kong, Dept Comp Sci, Pokfulam, Hong Kong, Peoples R China
[5] Cent S Univ, XiangYa Sch Med, Inst Biomed Engn, Changsha 410008, Hunan, Peoples R China
[6] BGI Shenzhen, BGI Tech, Shenzhen 518083, Peoples R China
[7] Univ Alberta, Dept Med, Edmonton, AB T6G 2E1, Canada
[8] Chinese Acad Sci, Shenzhen Inst Adv Technol, Shenzhen 518055, Peoples R China
[9] Univ Alberta, Dept Biol Sci, Edmonton, AB T6G 2E9, Canada
[10] Univ Copenhagen, Novo Nordisk Fdn, Ctr Basic Metab Res, DK-2200 Copenhagen, Denmark
[11] Univ Copenhagen, Dept Biol, DK-2200 Copenhagen, Denmark
[12] King Abdulaziz Univ, Princess Al Jawhara Ctr Excellence Res Hereditary, Jeddah 21589, Saudi Arabia
Keywords
RECONSTRUCTION; REVEALS; GENOME;
DOI
10.1093/bioinformatics/btu077
Chinese Library Classification (CLC) number
Q5 [Biochemistry];
Discipline classification code
071010 ; 081704 ;
Abstract
Motivation: Transcriptome sequencing has long been the favored method for quickly and inexpensively obtaining a large number of gene sequences from an organism with no reference genome. Owing to the rapid increase in throughput and decrease in cost of next-generation sequencing, RNA-Seq in particular has become the method of choice. However, the very short reads (e.g. 2 × 90 bp paired ends) from next-generation sequencing make de novo assembly to recover complete or full-length transcript sequences an algorithmic challenge.
Results: Here, we present SOAPdenovo-Trans, a de novo transcriptome assembler designed specifically for RNA-Seq. We evaluated its performance on transcriptome datasets from rice and mouse. Using as our benchmarks the known transcripts from these well-annotated genomes (sequenced a decade ago), we assessed how SOAPdenovo-Trans and two other popular transcriptome assemblers handled such practical issues as alternative splicing and variable expression levels. Our conclusion is that SOAPdenovo-Trans provides higher contiguity, lower redundancy and faster execution.
Pages: 1660-1666
Page count: 7