Alphabet-Independent Compressed Text Indexing

被引:39
作者
Belazzougui, Djamal [1 ]
Navarro, Gonzalo [2 ]
机构
[1] Univ Paris Diderot, Paris, France
[2] Univ Chile, Santiago, Chile
关键词
Algorithms; Compression; text indexing; succinct data structures; suffix trees; SUFFIX TREE CONSTRUCTION; ARRAYS;
D O I
10.1145/2635816
中图分类号
TP301 [理论、方法];
学科分类号
081202 ;
摘要
Self-indexes are able to represent a text asymptotically within the information-theoretic lower bound under the kth order entropy model and offer access to any text substring and indexed pattern searches. Their time complexities are not optimal, however; in particular, they are always multiplied by a factor that depends on the alphabet size. In this article, we achieve, for the first time, full alphabet independence in the time complexities of self-indexes while retaining space optimality. We also obtain some relevant byproducts.
引用
收藏
页数:19
相关论文
共 43 条
[1]  
[Anonymous], 1997, ACM SIGACT NEWS
[2]  
Apostolico A., 1985, Combinatorial Algorithms on Words, P85
[3]   Fast text searching for regular expressions or automaton searching on tries [J].
BaezaYates, RA ;
Gonnet, GH .
JOURNAL OF THE ACM, 1996, 43 (06) :915-936
[4]  
Barbay J, 2007, PROCEEDINGS OF THE EIGHTEENTH ANNUAL ACM-SIAM SYMPOSIUM ON DISCRETE ALGORITHMS, P680
[5]  
Belazzougui D., 2009, P 10 ALENEX
[6]  
Belazzougui D., 2012, P 20 ESA, P181
[7]  
Belazzougui D, 2009, PROCEEDINGS OF THE TWENTIETH ANNUAL ACM-SIAM SYMPOSIUM ON DISCRETE ALGORITHMS, P785
[8]  
Beller T, 2011, LECT NOTES COMPUT SC, V7024, P197, DOI 10.1007/978-3-642-24583-1_20
[9]  
Burrows M, 1994, BLOCK SORTING LOSSLE
[10]  
Clark D.R., 1996, THESIS U WATERLOO CA