A High-Throughput DNA Sequence Aligner for Microbial Ecology Studies

Cited by: 247
Author
Schloss, Patrick D. [1, 2]
Affiliations
[1] Univ Massachusetts, Dept Microbiol, Amherst, MA 01003 USA
[2] Univ Michigan, Dept Microbiol & Immunol, Ann Arbor, MI 48109 USA
Source
PLOS ONE | 2009 / Volume 4 / Issue 12
Funding
National Science Foundation (USA);
Keywords
ESTIMATING SPECIES RICHNESS; GUT MICROBIOTA; DIVERSITY; ALIGNMENT; PROGRAMS; DATABASE; ARB; BIOSPHERE; SEARCH;
DOI
10.1371/journal.pone.0008230
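Chinese Library Classification (CLC) number
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences];
Discipline classification code
07 ; 0710 ; 09 ;
Abstract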
As the scope of microbial surveys expands with the parallel growth in sequencing capacity, a significant bottleneck in data analysis is the ability to generate a biologically meaningful multiple sequence alignment. The most commonly used aligners have varying alignment quality and speed, tend to depend on a specific reference alignment, or lack a complete description of the underlying algorithm. The purpose of this study was to create and validate an aligner with the goal of quickly generating a high-quality alignment and having the flexibility to use any reference alignment. Using the simple nearest alignment space termination algorithm, the resulting aligner operates in linear time, requires a small memory footprint, and generates a high-quality alignment. In addition, the alignments generated for variable regions were of as high a quality as the alignment of full-length sequences. As implemented, the method was able to align 18 full-length 16S rRNA gene sequences and 58 V2 region sequences per second to the 50,000-column SILVA reference alignment. Most importantly, the resulting alignments were of a quality equal to SILVA-generated alignments. The aligner described in this study will enable scientists to rapidly generate robust multiple sequence alignments that are implicitly based upon the predicted secondary structure of the 16S rRNA molecule. Furthermore, because the implementation is not connected to a specific database, it is easy to generalize the method to reference alignments for any DNA sequence.
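The abstract does not spell out the alignment steps, so the following is only a minimal sketch of one idea behind template-based alignment: once a candidate sequence has been pairwise-aligned to an ungapped template, it can be placed into the reference alignment's columns by copying the template's gap pattern. The function name project_onto_reference, its parameters, and the use of "-" as the only gap character are assumptions made for illustration and are not taken from the paper. Candidate insertions relative to the template, which the full algorithm has to resolve, are deliberately left unhandled here.

```python
def project_onto_reference(template_gapped, pair_template, pair_candidate, gap="-"):
    """Place a candidate sequence into the columns of a reference alignment.

    template_gapped: the template's row in the reference alignment (with gaps).
    pair_template, pair_candidate: rows of a pairwise alignment between the
        ungapped template and the candidate (equal length).
    Hypothetical helper for illustration; not the published implementation.
    """
    if len(pair_template) != len(pair_candidate):
        raise ValueError("pairwise rows must have equal length")
    if gap in pair_template:
        # A gap in the template row means the candidate has an insertion.
        # Resolving that is the hard part of the real method and is skipped here.
        raise ValueError("insertions relative to the template are not handled")
    if pair_template != template_gapped.replace(gap, ""):
        raise ValueError("pairwise template does not match the reference row")

    out = []
    i = 0  # index into the pairwise rows (one entry per template base)
    for column in template_gapped:
        if column == gap:
            out.append(gap)                # reference gap column stays a gap
        else:
            out.append(pair_candidate[i])  # candidate base or deletion
            i += 1
    return "".join(out)


# The candidate lacks the template's "C", so that column becomes a gap.
aligned = project_onto_reference("A-CG--T", "ACGT", "A-GT")
print(aligned)
```

Because the loop visits each reference column once, the cost of this projection step grows linearly with the width of the reference alignment.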
Pages: 9