Genome alignment with graph data structures: a comparison

被引:29
|
作者
Kehr, Birte [1 ,2 ]
Trappe, Kathrin [1 ]
Holtgrewe, Manuel [1 ]
Reinert, Knut [1 ]
机构
[1] Free Univ Berlin, Dept Comp Sci, D-14195 Berlin, Germany
[2] Max Planck Inst Mol Genet, D-14195 Berlin, Germany
来源
BMC BIOINFORMATICS | 2014年 / 15卷
关键词
MULTIPLE SEQUENCE ALIGNMENT; ALGORITHM; PERMUTATIONS; ACCURACY;
D O I
10.1186/1471-2105-15-99
中图分类号
Q5 [生物化学];
学科分类号
071010 ; 081704 ;
摘要
Background: Recent advances in rapid, low-cost sequencing have opened up the opportunity to study complete genome sequences. The computational approach of multiple genome alignment allows investigation of evolutionarily related genomes in an integrated fashion, providing a basis for downstream analyses such as rearrangement studies and phylogenetic inference. Graphs have proven to be a powerful tool for coping with the complexity of genome-scale sequence alignments. The potential of graphs to intuitively represent all aspects of genome alignments led to the development of graph-based approaches for genome alignment. These approaches construct a graph from a set of local alignments, and derive a genome alignment through identification and removal of graph substructures that indicate errors in the alignment. Results: We compare the structures of commonly used graphs in terms of their abilities to represent alignment information. We describe how the graphs can be transformed into each other, and identify and classify graph substructures common to one or more graphs. Based on previous approaches, we compile a list of modifications that remove these substructures. Conclusion: We show that crucial pieces of alignment information, associated with inversions and duplications, are not visible in the structure of all graphs. If we neglect vertex or edge labels, the graphs differ in their information content. Still, many ideas are shared among all graph-based approaches. Based on these findings, we outline a conceptual framework for graph-based genome alignment that can assist in the development of future genome alignment tools.
引用
收藏
页数:20
相关论文
共 50 条
  • [1] Evaluating Statistical Multiple Sequence Alignment in Comparison to Other Alignment Methods on Protein Data Sets
    Nute, Michael
    Saleh, Ehsan
    Warnow, Tandy
    SYSTEMATIC BIOLOGY, 2019, 68 (03) : 396 - 411
  • [2] Geometric graph comparison from an alignment viewpoint
    Prakash, Surya
    Robles-Kelly, Antonio
    PATTERN RECOGNITION, 2012, 45 (10) : 3780 - 3794
  • [3] Fifty years of graph matching, network alignment and network comparison
    Emmert-Streib, Frank
    Dehmer, Matthias
    Shi, Yongtang
    INFORMATION SCIENCES, 2016, 346 : 180 - 197
  • [4] Comparison of alignment free string distances for complete genome phylogeny
    Guyon, Frederic
    Brochier-Armanet, Celine
    Guenoche, Alain
    ADVANCES IN DATA ANALYSIS AND CLASSIFICATION, 2009, 3 (02) : 95 - 108
  • [5] SEGA: Semiglobal Graph Alignment for Structure-Based Protein Comparison
    Mernberger, Marco
    Klebe, Gerhard
    Huellermeier, Eyke
    IEEE-ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS, 2011, 8 (05) : 1330 - 1343
  • [6] Wasserstein-Based Graph Alignment
    Maretic, Hermina Petric
    El Gheche, Mireille
    Minder, Matthias
    Chierchia, Giovanni
    Frossard, Pascal
    IEEE TRANSACTIONS ON SIGNAL AND INFORMATION PROCESSING OVER NETWORKS, 2022, 8 : 353 - 363
  • [7] On the Complexity of Sequence-to-Graph Alignment
    Jain, Chirag
    Zhang, Haowen
    Gao, Yu
    Aluru, Srinivas
    JOURNAL OF COMPUTATIONAL BIOLOGY, 2020, 27 (04) : 640 - 654
  • [8] Comprehensive survey of conserved RNA secondary structures in full-genome alignment of Hepatitis C virus
    Triebel, Sandra
    Lamkiewicz, Kevin
    Ontiveros, Nancy
    Sweeney, Blake
    Stadler, Peter F.
    Petrov, Anton I.
    Niepmann, Michael
    Marz, Manja
    SCIENTIFIC REPORTS, 2024, 14 (01):
  • [9] A Theoretical Model for Whole Genome Alignment
    Belal, Nahla A.
    Heath, Lenwood S.
    JOURNAL OF COMPUTATIONAL BIOLOGY, 2011, 18 (05) : 705 - 728
  • [10] Subspace clustering based on alignment and graph embedding
    Liao, Mengmeng
    Gu, Xiaodong
    KNOWLEDGE-BASED SYSTEMS, 2020, 188