A poor man's BLASTX - high-throughput metagenomic protein database search using PAUDA

Times cited: 41
Authors
Huson, Daniel H. [1, 2]
Xie, Chao [1, 3]
Affiliations
[1] Nanyang Technol Univ, Sch Biol Sci, Singapore Ctr Environm Life Sci Engn, Singapore 637551, Singapore
[2] Univ Tubingen, Ctr Bioinformat, D-72076 Tubingen, Germany
[3] Natl Univ Singapore, Inst Life Sci, Singapore 117456, Singapore
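Funding
National Research Foundation of Singapore
Keywords
ALIGNMENT
DOI
10.1093/bioinformatics/btt254
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract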
In the context of metagenomics, we introduce a new approach to protein database search called PAUDA, which runs approximately 10 000 times faster than BLASTX while achieving about one-third of the assignment rate of reads to KEGG orthology groups, and which produces gene and taxon abundance profiles that are highly correlated with those obtained with BLASTX. PAUDA requires <80 CPU hours to analyze a dataset of 246 million Illumina DNA reads from permafrost soil; a previous BLASTX analysis (on a subset of 176 million reads) reportedly required 800 000 CPU hours, and both analyses lead to the same clustering of samples by functional profiles.
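The speed-up quoted in the abstract can be checked with simple arithmetic. The sketch below is illustrative only: it uses just the four figures stated above, treats PAUDA's "<80 CPU hours" as an upper bound of exactly 80 (an assumption), and the variable and function names are hypothetical. Because the two analyses covered different numbers of reads, it also computes a per-read ratio alongside the raw ratio of CPU hours.

```python
# Figures quoted in the abstract. PAUDA's "<80 CPU hours" is taken as an
# upper bound of 80, so the ratios below are lower bounds on the speed-up.
PAUDA_CPU_HOURS = 80
PAUDA_READS = 246_000_000
BLASTX_CPU_HOURS = 800_000
BLASTX_READS = 176_000_000


def cpu_hours_per_million_reads(cpu_hours, reads):
    """Normalize a CPU-hour total by the number of reads, in millions."""
    return cpu_hours / (reads / 1_000_000)


# Raw ratio of the two CPU-hour totals, ignoring dataset size.
raw_speedup = BLASTX_CPU_HOURS / PAUDA_CPU_HOURS

# Ratio after normalizing each total by the reads actually analyzed.
pauda_rate = cpu_hours_per_million_reads(PAUDA_CPU_HOURS, PAUDA_READS)
blastx_rate = cpu_hours_per_million_reads(BLASTX_CPU_HOURS, BLASTX_READS)
per_read_speedup = blastx_rate / pauda_rate

print(f"raw speed-up:      {raw_speedup:.0f}x")
print(f"per-read speed-up: {per_read_speedup:.0f}x")
```

Under these assumptions the raw ratio is 10 000 and the per-read ratio is somewhat higher, since PAUDA processed the larger read set. Both are consistent with the "approximately 10 000 times" figure in the abstract.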
Pages: 38-39
Page count: 2