PanDelos-frags: A methodology for discovering pangenomic content of incomplete microbial assemblies

Cited by: 3
Authors
Bonnici, Vincenzo [1]
Mengoni, Claudia [2]
Mangoni, Manuel [3,4]
Franco, Giuditta [2]
Giugno, Rosalba [2]
Affiliations
[1] Univ Parma, Dept Math Phys & Comp Sci, Parco Area Sci 53-a Campus, I-43124 Parma, PR, Italy
[2] Univ Verona, Dept Comp Sci, Str Grazie 15, I-37134 Verona, VR, Italy
[3] Fdn IRCCS Casa Sollievo Sofferenza, I-71013 San Giovanni Rotondo, FG, Italy
[4] Sapienza Univ Rome, Dept Expt Med, Rome, RM, Italy
Keywords
Pangenome; Gene families; Sequence homology; Fragmented genomes; Computational approach; GENE-TRANSFER; PAN-GENOME; ALIGNMENT; ALGORITHM; VACCINES;
DOI
10.1016/j.jbi.2023.104552
Chinese Library Classification number
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
Pangenomics was originally defined as the problem of comparing the composition of genes into gene families within a set of bacterial isolates belonging to the same species. The problem requires the calculation of sequence homology among such genes. When combined with metagenomics, namely for human microbiome composition analysis, gene-oriented pangenome detection becomes a promising method to decipher ecosystem functions and population-level evolution. Established computational tools are able to investigate the genetic content of isolates for which a complete genomic sequence is available. However, there is a plethora of incomplete genomes that are available on public resources, which only a few tools may analyze. Incomplete means that the process for reconstructing their genomic sequence is not complete, and only fragments of their sequence are currently available. However, the information contained in these fragments may play an essential role in the analyses. Here, we present PanDelos-frags, a computational tool which exploits and extends previous results in analyzing complete genomes. It provides a new methodology for inferring missing genetic information and thus for managing incomplete genomes. PanDelos-frags outperforms state-of-the-art approaches in reconstructing gene families in synthetic benchmarks and in a real use case of metagenomics. PanDelos-frags is publicly available at https://github.com/InfOmics/PanDelos-frags.
Pages: 17
Related papers
55 in total
[1]   The evolution of bacterial genome assemblies - where do we need to go next? [J].
Altermann, Eric ;
Tegetmeyer, Halina E. ;
Chanyi, Ryan M. .
MICROBIOME RESEARCH REPORTS, 2022, 1 (03)
[2]   Interest of bacterial pangenome analyses in clinical microbiology [J].
Anani, Hussein ;
Zgheib, Rita ;
Hasni, Issam ;
Raoult, Didier ;
Fournier, Pierre-Edouard .
MICROBIAL PATHOGENESIS, 2020, 149
[3]   SPAdes: A New Genome Assembly Algorithm and Its Applications to Single-Cell Sequencing [J].
Bankevich, Anton ;
Nurk, Sergey ;
Antipov, Dmitry ;
Gurevich, Alexey A. ;
Dvorkin, Mikhail ;
Kulikov, Alexander S. ;
Lesin, Valery M. ;
Nikolenko, Sergey I. ;
Son Pham ;
Prjibelski, Andrey D. ;
Pyshkin, Alexey V. ;
Sirotkin, Alexander V. ;
Vyahhi, Nikolay ;
Tesler, Glenn ;
Alekseyev, Max A. ;
Pevzner, Pavel A. .
JOURNAL OF COMPUTATIONAL BIOLOGY, 2012, 19 (05) :455-477
[4]  
Barbosa Eudes Gv, 2014, World J Biol Chem, V5, P161, DOI 10.4331/wjbc.v5.i2.161
[5]   Spectral concepts in genome informational analysis [J].
Bonnici, V. ;
Franco, G. ;
Manca, V. .
THEORETICAL COMPUTER SCIENCE, 2021, 894 :23-30
[6]  
Bonnici V., 2016, Sci. Rep, V6, P1
[7]   Challenges in gene-oriented approaches for pangenome content discovery [J].
Bonnici, Vincenzo ;
Maresi, Emiliano ;
Giugno, Rosalba .
BRIEFINGS IN BIOINFORMATICS, 2021, 22 (03)
[8]   A k-mer Based Sequence Similarity for Pangenomic Analyses [J].
Bonnici, Vincenzo ;
Cracco, Andrea ;
Franco, Giuditta .
MACHINE LEARNING, OPTIMIZATION, AND DATA SCIENCE (LOD 2021), PT II, 2022, 13164 :31-44
[9]   PANPROVA: PANgenomic PROkaryotic eVolution of full Assemblies [J].
Bonnici, Vincenzo ;
Giugno, Rosalba .
BIOINFORMATICS, 2022, 38 (09) :2631-2632
[10]   PanDelos: a dictionary-based method for pan-genome content discovery [J].
Bonnici, Vincenzo ;
Giugno, Rosalba ;
Manca, Vincenzo .
BMC BIOINFORMATICS, 2018, 19