Variations of the parameterized longest previous factor

被引:6
作者
Beal, Richard [1 ]
Adjeroh, Donald [1 ]
机构
[1] West Virginia Univ, Lane Dept Comp Sci & Elect Engn, Morgantown, WV 26506 USA
关键词
Parameterized suffix array; Parameterized longest common prefix; p-string; p-match; p-border array; LPF; LCP; Permuted-LCP; Border array;
D O I
10.1016/j.jda.2012.05.004
中图分类号
O29 [应用数学];
学科分类号
070104 ;
摘要
The parameterized longest previous factor (pLPF) problem as defined for parameterized strings (p-strings) adds a level of parameterization to the longest previous factor (LPF) problem originally defined for traditional strings. In this work, we consider the construction of the pLPF data structure and identify the strong relationship between the pLPF linear time construction and several variations of the problem. Initially, we propose a taxonomy of classes for longest factor problems. Using this taxonomy, we show an interesting connection between the pLPF and popular data structures. It is shown that a subset of longest factor problems may be created with the pLPF construction. More specifically, the pLPF problem is used as a foundation to achieve the linear time construction of popular data structures such as the LCP, parameterized-LCP (pLCP), parameterized-border (p-border) array, and border array. We further generalize the permuted-LCP for p-strings and provide a linear time construction. A number of new variations of the pLPF problem are proposed and addressed in linear time for both p-strings and traditional strings, including the longest not-equal factor (LneF), longest reverse factor (LrF), and longest factor (LF). The framework of the pLPF construction is exploited to efficiently address a multitude of data structures with prospects in various applications. Finally, we implement our algorithms and perform various experiments to confirm theoretical results. (C) 2012 Elsevier B.V. All rights reserved.
引用
收藏
页码:129 / 150
页数:22
相关论文
共 41 条
[1]  
Adjeroh D, 2004, 2004 IEEE COMPUTATIONAL SYSTEMS BIOINFORMATICS CONFERENCE, PROCEEDINGS, P676
[2]   DNA sequence compression using the Burrows-Wheeler Transform [J].
Adjeroh, D ;
Zhang, Y ;
Mukherjee, A ;
Powell, M ;
Bell, T .
CSB2002: IEEE COMPUTER SOCIETY BIOINFORMATICS CONFERENCE, 2002, :303-313
[3]   ALPHABET DEPENDENCE IN PARAMETERIZED MATCHING [J].
AMIR, A ;
FARACH, M ;
MUTHUKRISHNAN, S .
INFORMATION PROCESSING LETTERS, 1994, 49 (03) :111-115
[4]  
[Anonymous], 2008, BURROWS WHEELER TRAN, DOI [DOI 10.1007/978-0-387-78909-5, 10.1007/978-0-387-78909-5]
[5]  
Baker B. S., 1993, Proceedings of the Twenty-Fifth Annual ACM Symposium on the Theory of Computing, P71, DOI 10.1145/167088.167115
[6]   Finding clones with dup: Analysis of an experiment [J].
Baker, Brenda S. .
IEEE TRANSACTIONS ON SOFTWARE ENGINEERING, 2007, 33 (09) :608-621
[7]   Parameterized pattern matching: Algorithms and applications [J].
Baker, BS .
JOURNAL OF COMPUTER AND SYSTEM SCIENCES, 1996, 52 (01) :28-42
[8]  
BAKER BS, 1995, PROCEEDINGS OF THE SIXTH ANNUAL ACM-SIAM SYMPOSIUM ON DISCRETE ALGORITHMS, P541
[9]  
Beal R. A., 2011, THESIS
[10]   p-Suffix sorting as arithmetic coding [J].
Beal, Richard ;
Adjeroh, Donald .
JOURNAL OF DISCRETE ALGORITHMS, 2012, 16 :151-169