Fulcrum: condensing redundant reads from high-throughput sequencing studies

Cited by: 27
Authors
Burriesci, Matthew S. [1]
Lehnert, Erik M. [1]
Pringle, John R. [1]
Affiliations
[1] Stanford Univ, Dept Genet, Sch Med, Stanford, CA 94305 USA
Funding
National Institutes of Health (USA); National Science Foundation (USA)
Keywords
GENERATION; ERRORS; GENOME;
DOI
10.1093/bioinformatics/bts123
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Results: We developed Fulcrum to collapse identical and near-identical Illumina and 454 reads (such as those from PCR clones) into single error-corrected sequences; it can process paired-end as well as single-end reads. Fulcrum is customizable and can be deployed on a single machine, a local network or a commercially available MapReduce cluster, and it has been optimized to maximize ease-of-use, cross-platform compatibility and future scalability. Sequence datasets have been collapsed by up to 71%, and the reduced number and improved quality of the resulting sequences allow assemblers to produce longer contigs while using less memory.
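The collapsing idea described in the abstract can be illustrated with a minimal sketch. This is not Fulcrum's actual algorithm or interface: the function names, the prefix-based bucketing, the Hamming-distance threshold and the majority-vote consensus are all assumptions chosen for illustration, and base quality scores and paired-end reads are ignored.

```python
from collections import Counter, defaultdict


def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def consensus(reads):
    """Most common base in each column of a list of equal-length reads."""
    return "".join(Counter(col).most_common(1)[0][0] for col in zip(*reads))


def collapse_reads(reads, prefix_len=8, max_mismatches=1):
    """Collapse identical and near-identical reads (hypothetical helper).

    Returns a list of (consensus_sequence, read_count) tuples.
    """
    # Bucket by length and leading bases so that only plausible
    # duplicates are compared with each other (an assumed heuristic).
    buckets = defaultdict(list)
    for read in reads:
        buckets[(len(read), read[:prefix_len])].append(read)

    collapsed = []
    for bucket in buckets.values():
        clusters = []  # each cluster is a list of reads; element 0 is its seed
        for read in bucket:
            for cluster in clusters:
                if hamming(read, cluster[0]) <= max_mismatches:
                    cluster.append(read)
                    break
            else:
                # No existing seed is close enough: start a new cluster.
                clusters.append([read])
        collapsed.extend((consensus(c), len(c)) for c in clusters)
    return collapsed


if __name__ == "__main__":
    example = ["ACGTACGTAA", "ACGTACGTAA", "ACGTACGTAT", "TTTTCCCCGG"]
    for seq, count in collapse_reads(example):
        print(seq, count)
```

In this sketch a sequencing error inside the first `prefix_len` bases places a read in a different bucket, so it would not be merged with its duplicates; a real tool would need a more robust grouping strategy.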
Pages: 1324-1327
Number of pages: 4