BPGA- an ultra-fast pan-genome analysis pipeline

Cited by: 745
Authors
Chaudhari, Narendrakumar M. [1]
Gupta, Vinod Kumar [1]
Dutta, Chitra [1]
Affiliations
[1] Indian Inst Chem Biol, CSIR, Struct Biol & Bioinformat Div, 4 Raja SC Mullick Rd, Kolkata 700032, India
Source
SCIENTIFIC REPORTS | 2016 / Volume 6
Keywords
STREPTOCOCCUS-PNEUMONIAE; SEQUENCE; IDENTIFICATION; REVEALS; STRAINS; CORE; PANGENOME; EVOLUTION; INSIGHTS; VACCINE;
DOI
10.1038/srep24373
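Chinese Library Classification (CLC) number
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences]
Discipline classification code
07; 0710; 09
Abstract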
Recent advances in ultra-high-throughput sequencing technology and metagenomics have led to a paradigm shift in microbial genomics from comparisons of a few genomes to large-scale pan-genome studies at different scales of phylogenetic resolution. Pan-genome studies provide a framework for estimating the genomic diversity of the dataset, determining the core (conserved), accessory (dispensable) and unique (strain-specific) gene pools of a species, tracing horizontal gene flux across strains and providing insight into species evolution. The existing pan-genome software tools suffer from various limitations, such as limited datasets, difficult installation or requirements, and inadequate functional features. Here we present an ultra-fast computational pipeline, BPGA (Bacterial Pan Genome Analysis tool), with seven functional modules. In addition to the routine pan-genome analyses, BPGA introduces a number of novel features for downstream analyses, such as core/pan/MLST (Multi Locus Sequence Typing) phylogeny, exclusive presence/absence of genes in specific strains, subset analysis, atypical G+C content analysis and KEGG & COG mapping of core, accessory and unique genes. Other notable features include minimal running prerequisites, freedom to select the gene clustering method, ultra-fast execution, a user-friendly command-line interface and high-quality graphics outputs. The performance of BPGA has been evaluated using a dataset of complete genome sequences of 28 Streptococcus pyogenes strains.
Pages: 10