A Linear-Time Algorithm for Seeds Computation

被引:10
|
作者
Kociumaka, Tomasz [1 ,2 ]
Kubica, Marcin [1 ]
Radoszewski, Jakub [1 ]
Rytter, Wojciech [1 ]
Walen, Tomasz [1 ]
机构
[1] Univ Warsaw, Inst Informat, Banacha 2, PL-02097 Warsaw, Poland
[2] Bar Ilan Univ, Dept Comp Sci, IL-5290002 Ramat Gan, Israel
基金
欧盟地平线“2020”;
关键词
Seed of a word; quasiperiodicity; suffix tree; subword complexity; COVER; QUASIPERIODICITIES; ARRAY;
D O I
10.1145/3386369
中图分类号
TP301 [理论、方法];
学科分类号
081202 ;
摘要
A seed in a word is a relaxed version of a period in which the occurrences of the repeating subword may overlap. Our first contribution is a linear-time algorithm computing a linear-size representation of all seeds of a word (the number of seeds might be quadratic). In particular, one can easily derive the shortest seed and the number of seeds from our representation. Thus, we solve an open problem stated in a survey by Smyth from 2000 and improve upon a previous O(n logn)-time algorithm by Iliopoulos et al. from 1996. Our approach is based on combinatorial relations between seeds and subword complexity (used here for the first time in the context of seeds). In previous papers, compact representations of seeds consisted of two independent parts operating on the suffix tree of the input word and the suffix tree of its reverse, respectively. Our second contribution is a novel and significantly simpler representation of all seeds that avoids dealing with the suffix tree of the reversed word. This result is also of independent interest from a combinatorial point of view. A preliminary version of this work, with a much more complex algorithm constructing a representation of seeds on two suffix trees, was presented at the 23rd Annual ACM-SIAM Symposium on Discrete Algorithms (SODA'12).
引用
收藏
页数:23
相关论文
共 50 条
  • [1] Linear-time computation of local periods
    Duval, JP
    Kolpakov, R
    Kucherov, G
    Lecroq, T
    Lefebvre, A
    MATHEMATICAL FOUNDATIONS OF COMPUTER SCIENCE 2003, PROCEEDINGS, 2003, 2747 : 388 - 397
  • [2] Linear-time computation of local periods
    Duval, JP
    Kolpakov, R
    Kucherov, G
    Lecroq, T
    Lefebvre, A
    THEORETICAL COMPUTER SCIENCE, 2004, 326 (1-3) : 229 - 240
  • [3] Linear-time algorithm for the generation of trees
    Algorithmica (New York), 1997, 17 (02):
  • [4] A linear-time majority tree algorithm
    Amenta, N
    Clarke, F
    John, KS
    ALGORITHMS IN BIOINFORMATICS, PROCEEDINGS, 2003, 2812 : 216 - 227
  • [5] A linear-time algorithm for the generation of trees
    L. Alonso
    J. L. Rémy
    R. Schott
    Algorithmica, 1997, 17 : 162 - 182
  • [6] A LINEAR-TIME ALGORITHM FOR FINDING AN AMBITUS
    MISHRA, B
    TARJAN, RE
    ALGORITHMICA, 1992, 7 (5-6) : 521 - 554
  • [7] A linear-time algorithm for the generation of trees
    Alonso, L
    Remy, JL
    Schott, R
    ALGORITHMICA, 1997, 17 (02) : 162 - 182
  • [8] Linear-time computation of similarity measures for sequential data
    Rieck, Konrad
    Laskov, Pavel
    JOURNAL OF MACHINE LEARNING RESEARCH, 2008, 9 : 23 - 48
  • [9] LINEAR-TIME COMPUTATION OF OPTIMAL SUBGRAPHS OF DECOMPOSABLE GRAPHS
    BERN, MW
    LAWLER, EL
    WONG, AL
    JOURNAL OF ALGORITHMS, 1987, 8 (02) : 216 - 235
  • [10] Linear-Time Computation of Prefix Table for Weighted Strings
    Barton, Carl
    Pissis, Solon P.
    COMBINATORICS ON WORDS, WORDS 2015, 2015, 9304 : 73 - 84