A High-Throughput DNA Sequence Aligner for Microbial Ecology Studies

Cited by: 247
Authors
Schloss, Patrick D. [1, 2]
Affiliations
[1] Univ Massachusetts, Dept Microbiol, Amherst, MA 01003 USA
[2] Univ Michigan, Dept Microbiol & Immunol, Ann Arbor, MI 48109 USA
Source
PLOS ONE | 2009 / Vol. 4 / Issue 12
Funding
National Science Foundation (USA);
Keywords
ESTIMATING SPECIES RICHNESS; GUT MICROBIOTA; DIVERSITY; ALIGNMENT; PROGRAMS; DATABASE; ARB; BIOSPHERE; SEARCH;
DOI
10.1371/journal.pone.0008230
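Chinese Library Classification
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences];
Discipline classification code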
07 ; 0710 ; 09 ;
Abstract
As the scope of microbial surveys expands with the parallel growth in sequencing capacity, a significant bottleneck in data analysis is the ability to generate a biologically meaningful multiple sequence alignment. The most commonly used aligners have varying alignment quality and speed, tend to depend on a specific reference alignment, or lack a complete description of the underlying algorithm. The purpose of this study was to create and validate an aligner with the goal of quickly generating a high quality alignment and having the flexibility to use any reference alignment. Using the simple nearest alignment space termination algorithm, the resulting aligner operates in linear time, requires a small memory footprint, and generates a high quality alignment. In addition, the alignments generated for variable regions were of as high a quality as the alignment of full-length sequences. As implemented, the method was able to align 18 full-length 16S rRNA gene sequences and 58 V2 region sequences per second to the 50,000-column SILVA reference alignment. Most importantly, the resulting alignments were of a quality equal to SILVA-generated alignments. The aligner described in this study will enable scientists to rapidly generate robust multiple sequence alignments that are implicitly based upon the predicted secondary structure of the 16S rRNA molecule. Furthermore, because the implementation is not connected to a specific database, it is easy to generalize the method to reference alignments for any DNA sequence.
Pages: 9
Related papers
50 in total
  • [1] High-throughput DNA sequence data compression
    Zhu, Zexuan
    Zhang, Yongpeng
    Ji, Zhen
    He, Shan
    Yang, Xiao
    BRIEFINGS IN BIOINFORMATICS, 2015, 16 (01) : 1 - 15
  • [2] High-Throughput Block Optical DNA Sequence Identification
    Sagar, Dodderi Manjunatha
    Korshoj, Lee Erik
    Hanson, Katrina Bethany
    Chowdhury, Partha Pratim
    Otoupal, Peter Britton
    Chatterjee, Anushree
    Nagpal, Prashant
    SMALL, 2018, 14 (04)
  • [3] Multivariate analysis of complex DNA sequence electropherograms for high-throughput quantitative analysis of mixed microbial populations
    Trosvik, Pal
    Skanseng, Beate
    Jakobsen, Kjetill S.
    Stenseth, Nils C.
    Naes, Tormod
    Rudi, Knut
    APPLIED AND ENVIRONMENTAL MICROBIOLOGY, 2007, 73 (15) : 4975 - 4983
  • [4] A high-throughput distributed DNA sequence analysis and database system
    Inman, JT
    Flores, HR
    May, GD
    Weller, JW
    Bell, CJ
    IBM SYSTEMS JOURNAL, 2001, 40 (02) : 464 - 486
  • [5] Analysis of DNA sequence variants detected by high-throughput sequencing
    Adams, David R.
    Sincan, Murat
    Fajardo, Karin Fuentes
    Mullikin, James C.
    Pierson, Tyler M.
    Toro, Camilo
    Boerkoel, Cornelius F.
    Tifft, Cynthia J.
    Gahl, William A.
    Markello, Tom C.
    HUMAN MUTATION, 2012, 33 (04) : 599 - 608
  • [6] A Scalable High-Throughput Pipeline Architecture for DNA Sequence Alignment
    Ghosh, Surajeet
    Mandal, Sriparna
    Ray, Sanchita Saha
    TENCON 2015 - 2015 IEEE REGION 10 CONFERENCE, 2015,
  • [7] High-Throughput Sequence Analysis of Microbial Communities of Soybean in Northeast China
    Wang, Yuanyuan
    Bai, Qingyao
    Meng, Fanqi
    Dong, Wei
    Fan, Haiyan
    Zhu, Xiaofeng
    Duan, Yuxi
    Chen, Lijie
    AGRONOMY-BASEL, 2025, 15 (02):
  • [8] Perspectives and Benefits of High-Throughput Long-Read Sequencing in Microbial Ecology
    Tedersoo, Leho
    Albertsen, Math
    Anslan, Sten
    Callahan, Benjamin
    APPLIED AND ENVIRONMENTAL MICROBIOLOGY, 2021, 87 (17) : 1 - 19
  • [9] High-throughput DNA barcoding for ecological network studies
    Toju, Hirokazu
    POPULATION ECOLOGY, 2015, 57 (01) : 37 - 51
  • [10] DNA Sequence Recognition by DNA Primase Using High-Throughput Primase Profiling
    Ilic, Stefan
    Cohen, Shira
    Afek, Ariel
    Gordan, Raluca
    Lukatsky, David B.
    Akabayov, Barak
    JOVE-JOURNAL OF VISUALIZED EXPERIMENTS, 2019, (152):