GET_HOMOLOGUES, a Versatile Software Package for Scalable and Robust Microbial Pangenome Analysis

Cited by: 642
Authors
Contreras-Moreira, Bruno [1, 2]
Vinuesa, Pablo [3]
Affiliations
[1] CSIC, EEAD, Zaragoza, Spain
[2] Fdn ARAID, Zaragoza, Spain
[3] Univ Nacl Autonoma Mexico, Ctr Ciencias Genom, Cuernavaca 62191, Morelos, Mexico
Keywords
STREPTOCOCCUS-PNEUMONIAE; ORTHOLOGY; GENOMICS; DATABASE; BACTERIAL; CLUSTERS; SEQUENCE;
DOI
10.1128/AEM.02411-13
Chinese Library Classification number
Q81 [Bioengineering (Biotechnology)]; Q93 [Microbiology];
Discipline classification code
071005 ; 0836 ; 090102 ; 100705 ;
Abstract
GET_HOMOLOGUES is an open-source software package that builds on popular orthology-calling approaches making highly customizable and detailed pangenome analyses of microorganisms accessible to nonbioinformaticians. It can cluster homologous gene families using the bidirectional best-hit, COGtriangles, or OrthoMCL clustering algorithms. Clustering stringency can be adjusted by scanning the domain composition of proteins using the HMMER3 package, by imposing desired pairwise alignment coverage cutoffs, or by selecting only syntenic genes. The resulting homologous gene families can be made even more robust by computing consensus clusters from those generated by any combination of the clustering algorithms and filtering criteria. Auxiliary scripts make the construction, interrogation, and graphical display of core genome and pangenome sets easy to perform. Exponential and binomial mixture models can be fitted to the data to estimate theoretical core genome and pangenome sizes, and high-quality graphics can be generated. Furthermore, pangenome trees can be easily computed and basic comparative genomics performed to identify lineage-specific genes or gene family expansions. The software is designed to take advantage of modern multiprocessor personal computers as well as computer clusters to parallelize time-consuming tasks. To demonstrate some of these capabilities, we survey a set of 50 Streptococcus genomes annotated in the Orthologous Matrix (OMA) browser as a benchmark case. The package can be downloaded at http://www.eead.csic.es/compbio/soft/gethoms.php and http://maya.ccg.unam.mx/soft/gethoms.php.
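The bidirectional best-hit idea mentioned in the abstract, combined with a pairwise alignment coverage cutoff, can be illustrated with a minimal sketch. This is not the package's own implementation: the `Hit` record, its field names, the function names, the toy scores, and the 75% default cutoff are all assumptions made for illustration.

```python
from collections import namedtuple

# Hypothetical record for one pairwise alignment hit (illustrative field names).
Hit = namedtuple("Hit", "query subject bitscore coverage")


def best_hits(hits, min_coverage):
    """Map each query to its highest-scoring subject among hits passing the cutoff."""
    best = {}
    for h in hits:
        if h.coverage < min_coverage:
            continue  # discard alignments that cover too little of the sequence
        if h.query not in best or h.bitscore > best[h.query].bitscore:
            best[h.query] = h
    return {q: h.subject for q, h in best.items()}


def bidirectional_best_hits(a_vs_b, b_vs_a, min_coverage=75.0):
    """Return sorted (gene_a, gene_b) pairs that are each other's best hit."""
    fwd = best_hits(a_vs_b, min_coverage)
    rev = best_hits(b_vs_a, min_coverage)
    return sorted((a, b) for a, b in fwd.items() if rev.get(b) == a)


# Toy data: genome A searched against genome B, and the reverse search.
a_vs_b = [
    Hit("a1", "b1", 200, 90),
    Hit("a1", "b2", 150, 95),
    Hit("a2", "b2", 180, 60),
    Hit("a3", "b3", 120, 80),
]
b_vs_a = [
    Hit("b1", "a1", 198, 90),
    Hit("b2", "a2", 181, 60),
    Hit("b2", "a1", 150, 95),
    Hit("b3", "a3", 119, 80),
]

pairs = bidirectional_best_hits(a_vs_b, b_vs_a)
relaxed_pairs = bidirectional_best_hits(a_vs_b, b_vs_a, min_coverage=50.0)
```

In this toy data the pair `a2`/`b2` is reciprocal only when the coverage cutoff is relaxed to 50%, which shows how a stricter cutoff can make the resulting clusters more conservative.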
Pages: 7696-7701
Number of pages: 6