PanDelos-frags: A methodology for discovering pangenomic content of incomplete microbial assemblies

Cited by: 3
Authors
Bonnici, Vincenzo [1]
Mengoni, Claudia [2]
Mangoni, Manuel [3, 4]
Franco, Giuditta [2]
Giugno, Rosalba [2]
Affiliations
[1] Univ Parma, Dept Math Phys & Comp Sci, Parco Area Sci 53-a Campus, I-43124 Parma, PR, Italy
[2] Univ Verona, Dept Comp Sci, Str Grazie 15, I-37134 Verona, VR, Italy
[3] Fdn IRCCS Casa Sollievo Sofferenza, I-71013 San Giovanni Rotondo, FG, Italy
[4] Sapienza Univ Rome, Dept Expt Med, Rome, RM, Italy
Keywords
Pangenome; Gene families; Sequence homology; Fragmented genomes; Computational approach; GENE-TRANSFER; PAN-GENOME; ALIGNMENT; ALGORITHM; EVOLUTION; VACCINES
DOI
10.1016/j.jbi.2023.104552
Chinese Library Classification number
TP39 [Computer applications]
Subject classification code
081203; 0835
Abstract
Pangenomics was originally defined as the problem of comparing the composition of genes into gene families within a set of bacterial isolates belonging to the same species. The problem requires the calculation of sequence homology among such genes. When combined with metagenomics, namely for human microbiome composition analysis, gene-oriented pangenome detection becomes a promising method to decipher ecosystem functions and population-level evolution.
Established computational tools are able to investigate the genetic content of isolates for which a complete genomic sequence is available. However, there is a plethora of incomplete genomes available on public resources, which only a few tools can analyze. Incomplete means that the process of reconstructing their genomic sequence is not complete, and only fragments of their sequence are currently available. Nevertheless, the information contained in these fragments may play an essential role in the analyses.
Here, we present PanDelos-frags, a computational tool which exploits and extends previous results in analyzing complete genomes. It provides a new methodology for inferring missing genetic information and thus for managing incomplete genomes. PanDelos-frags outperforms state-of-the-art approaches in reconstructing gene families in synthetic benchmarks and in a real use case of metagenomics. PanDelos-frags is publicly available at https://github.com/InfOmics/PanDelos-frags.
Pages: 17