Classification of metagenomic sequences: methods and challenges

Cited by: 146
Authors
Mande, Sharmila S. [1]
Mohammed, Monzoorul Haque [1]
Ghosh, Tarini Shankar [1]
Affiliations
[1] Tata Consultancy Serv Ltd, TCS Innovat Labs, Biosci Div, Pune 411013, Maharashtra, India
Keywords
binning algorithms; metagenomics; taxonomic classification; lowest common ancestor; oligo-nucleotide composition; taxonomic diversity; ACCURATE TAXONOMIC CLASSIFICATION; PHYLOGENETIC CLASSIFICATION; DNA-SEQUENCES; ALGORITHM; ALIGNMENT; IDENTIFICATION; METABOLISM; RESOURCE; BACTERIA
DOI
10.1093/bib/bbs054
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Subject classification codes
071010; 081704
Abstract
Characterizing the taxonomic diversity of microbial communities is one of the primary objectives of metagenomic studies. Taxonomic analysis of microbial communities, a process referred to as binning, is challenging for the following reasons. Primarily, query sequences originating from the genomes of most microbes in an environmental sample lack taxonomically related sequences in existing reference databases. This absence of a taxonomic context makes binning a very challenging task. Limitations of current sequencing platforms, with respect to short read lengths and sequencing errors/artifacts, are also key factors that determine the overall binning efficiency. Furthermore, the sheer volume of metagenomic datasets also demands highly efficient algorithms that can operate within reasonable requirements of compute power. This review discusses the premise, methodologies, advantages, limitations and challenges of various methods available for binning of metagenomic datasets obtained using the shotgun sequencing approach. Various parameters as well as strategies used for evaluating binning efficiency are then reviewed.
Pages: 669-681
Number of pages: 13