Ironing out the wrinkles in the rare biosphere through improved OTU clustering

Cited by: 1073
Authors
Huse, Susan M. [1]
Welch, David Mark [1]
Morrison, Hilary G. [1]
Sogin, Mitchell L. [1]
Affiliations
[1] Marine Biol Lab, Josephine Bay Paul Ctr, Woods Hole, MA 02543 USA
Funding
US National Science Foundation;
Keywords
MULTIPLE SEQUENCE ALIGNMENT; MICROBIAL DIVERSITY; ACCURACY; QUALITY; SEARCH;
DOI
10.1111/j.1462-2920.2010.02193.x
Chinese Library Classification
Q93 [Microbiology];
Discipline classification code
071005 ; 100705 ;
Abstract
Deep sequencing of PCR amplicon libraries facilitates the detection of low-abundance populations in environmental DNA surveys of complex microbial communities. At the same time, deep sequencing can lead to overestimates of microbial diversity through the generation of low-frequency, error-prone reads. Even with sequencing error rates below 0.005 per nucleotide position, the common method of generating operational taxonomic units (OTUs) by multiple sequence alignment and complete-linkage clustering significantly increases the number of predicted OTUs and inflates richness estimates. We show that a 2% single-linkage preclustering methodology followed by an average-linkage clustering based on pairwise alignments more accurately predicts expected OTUs in both single and pooled template preparations of known taxonomic composition. This new clustering method can reduce the OTU richness in environmental samples by as much as 30-60% but does not reduce the fraction of OTUs in long-tailed rank abundance curves that defines the rare biosphere.
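The two-stage idea in the abstract (single-linkage preclustering at 2%, then average-linkage clustering) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the distance function is a toy mismatch fraction on equal-length strings standing in for a pairwise-alignment distance, the 3% OTU threshold is an assumed value not given in the abstract, and the names `distance`, `single_linkage_precluster`, `average_linkage` and `cluster_otus` are hypothetical.

```python
from itertools import combinations


def distance(a, b):
    """Toy distance: fraction of mismatched positions.

    Assumption: stands in for a pairwise-alignment distance and
    only handles equal-length sequences.
    """
    if len(a) != len(b):
        raise ValueError("toy distance assumes equal-length sequences")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def single_linkage_precluster(seqs, threshold=0.02):
    """Group sequences linked by chains of pairs within `threshold`."""
    parent = list(range(len(seqs)))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(seqs)), 2):
        if distance(seqs[i], seqs[j]) <= threshold:
            parent[find(i)] = find(j)

    groups = {}
    for i in range(len(seqs)):
        groups.setdefault(find(i), []).append(i)
    return list(groups.values())


def average_linkage(seqs, clusters, threshold=0.03):
    """Merge clusters while the smallest mean pairwise distance
    between two clusters is at most `threshold` (assumed 3%)."""
    clusters = [list(c) for c in clusters]

    def mean_dist(c1, c2):
        total = sum(distance(seqs[i], seqs[j]) for i in c1 for j in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > 1:
        d, a, b = min(
            (mean_dist(clusters[a], clusters[b]), a, b)
            for a, b in combinations(range(len(clusters)), 2)
        )
        if d > threshold:
            break
        clusters[a].extend(clusters[b])
        del clusters[b]  # safe: a < b, so index a is unaffected
    return clusters


def cluster_otus(seqs):
    """Precluster at 2%, then average-linkage cluster the preclusters."""
    return average_linkage(seqs, single_linkage_precluster(seqs))


if __name__ == "__main__":
    base = "A" * 100
    reads = [
        base,               # abundant template
        "G" + base[1:],     # 1 mismatch from the template
        "GGG" + base[3:],   # 3 mismatches from the template, 2 from the read above
        "C" * 100,          # unrelated sequence
    ]
    print(cluster_otus(reads))
```

In this toy input the third read is 3% from the template but only 2% from the second read, so single-linkage preclustering chains all three together before average-linkage clustering runs; the unrelated sequence stays in its own OTU.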
Pages: 1889 - 1898
Page count: 10
Related papers
25 in total
  • [1] BASIC LOCAL ALIGNMENT SEARCH TOOL
    ALTSCHUL, SF
    GISH, W
    MILLER, W
    MYERS, EW
    LIPMAN, DJ
    [J]. JOURNAL OF MOLECULAR BIOLOGY, 1990, 215 (03) : 403 - 410
  • [2] Quality scores and SNP detection in sequencing-by-synthesis systems
    Brockman, William
    Alvarez, Pablo
    Young, Sarah
    Garber, Manuel
    Giannoukos, Georgia
    Lee, William L.
    Russ, Carsten
    Lander, Eric S.
    Nusbaum, Chad
    Jaffe, David B.
    [J]. GENOME RESEARCH, 2008, 18 (05) : 763 - 770
  • [3] NAST: a multiple sequence alignment server for comparative analysis of 16S rRNA genes
    DeSantis, T. Z.
    Hugenholtz, P.
    Keller, K.
    Brodie, E. L.
    Larsen, N.
    Piceno, Y. M.
    Phan, R.
    Andersen, G. L.
    [J]. NUCLEIC ACIDS RESEARCH, 2006, 34 : W394 - W399
  • [4] MUSCLE: multiple sequence alignment with high accuracy and high throughput
    Edgar, RC
    [J]. NUCLEIC ACIDS RESEARCH, 2004, 32 (05) : 1792 - 1797
  • [5] Base-calling of automated sequencer traces using phred. II. Error probabilities
    Ewing, B
    Green, P
    [J]. GENOME RESEARCH, 1998, 8 (03) : 186 - 194
  • [6] Base-calling of automated sequencer traces using phred. I. Accuracy assessment
    Ewing, B
    Hillier, L
    Wendl, MC
    Green, P
    [J]. GENOME RESEARCH, 1998, 8 (03) : 175 - 185
  • [7] The seasonal structure of microbial communities in the Western English Channel
    Gilbert, Jack A.
    Field, Dawn
    Swift, Paul
    Newbold, Lindsay
    Oliver, Anna
    Smyth, Tim
    Somerfield, Paul J.
    Huse, Sue
    Joint, Ian
    [J]. ENVIRONMENTAL MICROBIOLOGY, 2009, 11 (12) : 3132 - 3139
  • [8] Microbial population structures in the deep marine biosphere
    Huber, Julie A.
    Mark Welch, David
    Morrison, Hilary G.
    Huse, Susan M.
    Neal, Phillip R.
    Butterfield, David A.
    Sogin, Mitchell L.
    [J]. SCIENCE, 2007, 318 (5847) : 97 - 100
  • [9] Effect of PCR amplicon size on assessments of clone library microbial diversity and community structure
    Huber, Julie A.
    Morrison, Hilary G.
    Huse, Susan M.
    Neal, Phillip R.
    Sogin, Mitchell L.
    Welch, David B. Mark
    [J]. ENVIRONMENTAL MICROBIOLOGY, 2009, 11 (05) : 1292 - 1302
  • [10] Accuracy and quality of massively parallel DNA pyrosequencing
    Huse, Susan M.
    Huber, Julie A.
    Morrison, Hilary G.
    Sogin, Mitchell L.
    Mark Welch, David
    [J]. GENOME BIOLOGY, 2007, 8 (07)