Comparing incomplete sequences via longest common subsequence

被引:2
作者
Castelli, Mauro [1 ]
Dondi, Riccardo [2 ]
Mauri, Giancarlo [3 ]
Zoppis, Italo [3 ]
机构
[1] Univ Nova Lisboa, NOVA Informat Management Sch NOVA IMS, Campus Campolide, P-1070312 Lisbon, Portugal
[2] Univ Bergamo, Dipartimento Sci Umane & Sociali, Bergamo, Italy
[3] Univ Milano Bicocca, Dipartimento Informat Sistemist & Comunicaz, Milan, Italy
关键词
Longest common subsequence; String algorithms; Approximation algorithms; Computational complexity; Fixed-parameter algorithms; APPROXIMATION; ALGORITHM;
D O I
10.1016/j.tcs.2019.09.022
中图分类号
TP301 [理论、方法];
学科分类号
081202 ;
摘要
Inspired by scaffold filling, a recent approach for genome reconstruction from incomplete data, we consider a variant of the well-known longest common subsequence problem for the comparison of two sequences. The new problem, called Longest Filled Common Subsequence, aims to compare a complete sequence with an incomplete one, i.e. with some missing elements. Longest Filled Common Subsequence (LFCS), given a complete sequence A, an incomplete sequence B, and a multiset M of symbols missing in B, asks for a sequence B* obtained by inserting the symbols of M into B so that B* induces a common subsequence with A of maximum length. We investigate the computational and approximation complexity of the problem and we show that it is NP-hard and APX-hard when A contains at most two occurrences of each symbol, and we give a polynomial time algorithm when the input sequences are over a constant-size alphabet. We give a 3/5-approximation algorithm for the Longest Filled Common Subsequence problem. Finally, we present a fixed-parameter algorithm for the problem, when it is parameterized by the number of symbols inserted in B that "match" symbols of A. (C) 2019 Elsevier B.V. All rights reserved.
引用
收藏
页码:272 / 285
页数:14
相关论文
共 21 条
  • [1] Repetition-free longest common subsequence
    Adi, Said S.
    Braga, Marilia D. V.
    Fernandes, Cristina G.
    Ferreira, Carlos E.
    Martinez, Fabio Viduani
    Sagot, Marie-France
    Stefanes, Marco A.
    Tjandraatmadja, Christian
    Wakabayashi, Yoshiko
    [J]. DISCRETE APPLIED MATHEMATICS, 2010, 158 (12) : 1315 - 1324
  • [2] Some APX-completeness results for cubic graphs
    Alimonti, P
    Kann, V
    [J]. THEORETICAL COMPUTER SCIENCE, 2000, 237 (1-2) : 123 - 134
  • [3] COLOR-CODING
    ALON, N
    YUSTER, R
    ZWICK, U
    [J]. JOURNAL OF THE ASSOCIATION FOR COMPUTING MACHINERY, 1995, 42 (04): : 844 - 856
  • [4] Algorithms for the constrained longest common subsequence problems
    Arslan, AN
    Egecioglu, Ö
    [J]. INTERNATIONAL JOURNAL OF FOUNDATIONS OF COMPUTER SCIENCE, 2005, 16 (06) : 1099 - 1109
  • [5] Ausiello Giorgio, 1999, COMPLEXITY APPROXIMA, DOI DOI 10.1007/978-3-642-58412-1
  • [6] On the parameterized complexity of the repetition free longest common subsequence problem
    Blin, Guillaume
    Bonizzoni, Paola
    Dondi, Riccardo
    Sikora, Florian
    [J]. INFORMATION PROCESSING LETTERS, 2012, 112 (07) : 272 - 276
  • [7] Exemplar longest common subsequence
    Bonizzoni, Paola
    Della Vedova, Gianluca
    Dondi, Riccardo
    Fertin, Guillaume
    Rizzi, Raffaella
    Vialette, Stephane
    [J]. IEEE-ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS, 2007, 4 (04) : 535 - 543
  • [8] Variants of constrained longest common subsequence
    Bonizzoni, Paola
    Della Vedova, Gianluca
    Dondi, Riccardo
    Pirola, Yuri
    [J]. INFORMATION PROCESSING LETTERS, 2010, 110 (20) : 877 - 881
  • [9] Bulteau L., 2017, LIPICS, V78
  • [10] Fixed-parameter algorithms for scaffold filling
    Bulteau, Laurent
    Carrieri, Anna Paola
    Dondi, Riccardo
    [J]. THEORETICAL COMPUTER SCIENCE, 2015, 568 : 72 - 83