A linear time algorithm for Shortest Cyclic Cover of Strings

被引:3
作者
Cazaux, Bastien [1 ,2 ,3 ,4 ]
Rivals, Eric [1 ,2 ,3 ,4 ]
机构
[1] CNRS, LIRMM, 161 Rue Ada, F-34095 Montpellier 5, France
[2] Univ Montpellier, 161 Rue Ada, F-34095 Montpellier 5, France
[3] CNRS, Inst Biol Computat, 860 Rue St Priest, F-34095 Montpellier 5, France
[4] Univ Montpellier, 860 Rue St Priest, F-34095 Montpellier 5, France
关键词
Greedy; Shortest cyclic cover of strings; Superstring; Overlap; Minimum assignment; Graph;
D O I
10.1016/j.jda.2016.05.001
中图分类号
O29 [应用数学];
学科分类号
070104 ;
摘要
Merging words according to their overlap yields a superstring. This basic operation allows to infer long strings from a collection of short pieces, as in genome assembly. To capture a maximum of overlaps, the goal is to infer the shortest superstring of a set of input words. The Shortest Cyclic Cover of Strings (SCCS) problem asks, instead of a single linear superstring, for a set of cyclic strings that contain the words as substrings and whose sum of lengths is minimal. SCCS is used as a crucial step in polynomial time approximation algorithms for the notably hard Shortest Superstring problem, but it is solved in cubic time. The cyclic strings are then cut and merged to build a linear superstring. SCCS can also be solved by a greedy algorithm. Here, we propose a linear time algorithm for solving SCCS based on a Eulerian graph that captures all greedy solutions in linear space. Because the graph is Eulerian, this algorithm can also find a greedy solution of SCCS with the least number of cyclic strings. This has implications for solving certain instances of the Shortest linear or cyclic Superstring problems. (C) 2016 The Author(s). Published by Elsevier B.V.
引用
收藏
页码:56 / 67
页数:12
相关论文
共 20 条
  • [1] Blum A., 1991, ACM S THEOR COMP, P328
  • [2] Rotations of periodic strings and short superstrings
    Breslauer, D
    Jiang, T
    Jiang, ZG
    [J]. JOURNAL OF ALGORITHMS, 1997, 24 (02) : 340 - 353
  • [3] On suffix extensions in suffix trees
    Breslauer, Dany
    Italiano, Giuseppe F.
    [J]. THEORETICAL COMPUTER SCIENCE, 2012, 457 : 27 - 34
  • [4] Cazaux B., 2015, DISCRETE APPL MATH
  • [5] Cazaux B., 2014, P PRAG STRING C PSC, P148
  • [6] Reverse engineering of compact suffix trees and links: A novel algorithm
    Cazaux, Bastien
    Rivals, Eric
    [J]. JOURNAL OF DISCRETE ALGORITHMS, 2014, 28 : 9 - 22
  • [7] ON FINDING MINIMAL LENGTH SUPERSTRINGS
    GALLANT, J
    MAIER, D
    STORER, JA
    [J]. JOURNAL OF COMPUTER AND SYSTEM SCIENCES, 1980, 20 (01) : 50 - 58
  • [8] Golovnev A, 2013, LECT NOTES COMPUT SC, V7922, P120, DOI 10.1007/978-3-642-38905-4_13
  • [9] AN EFFICIENT ALGORITHM FOR THE ALL PAIRS SUFFIX PREFIX PROBLEM
    GUSFIELD, D
    LANDAU, GM
    SCHIEBER, B
    [J]. INFORMATION PROCESSING LETTERS, 1992, 41 (04) : 181 - 185
  • [10] FASTER IMPLEMENTATION OF A SHORTEST SUPERSTRING APPROXIMATION
    GUSFIELD, D
    [J]. INFORMATION PROCESSING LETTERS, 1994, 51 (05) : 271 - 274