On suffix extensions in suffix trees

被引:9
作者
Breslauer, Dany [2 ]
Italiano, Giuseppe F. [1 ]
机构
[1] Univ Roma Tor Vergata, Dipartimento Informat Sistemi & Prod, Rome, Italy
[2] Univ Haifa, Caesarea Rothschild Inst Interdisciplinary Applic, IL-31999 Haifa, Israel
基金
欧洲研究理事会; 欧盟第七框架计划;
关键词
Suffix trees; String algorithms; LINEAR-TIME; MAINTAINING ORDER; CONSTRUCTION; ALGORITHMS; ARRAYS;
D O I
10.1016/j.tcs.2012.07.018
中图分类号
TP301 [理论、方法];
学科分类号
081202 ;
摘要
Suffix trees are inherently asymmetric: prefix extensions only cause a few updates, while suffix extensions affect all suffixes causing a wave of updates. In his elegant linear-time on-line suffix tree algorithm Ukkonen relaxed the prevailing suffix tree representation and introduced two changes to avoid repeated structural updates and circumvent the inherent complexity of suffix extensions: (1) open ended edges that enjoy gratuitous leaf updates, and (2) the omission of implicit nodes. In this paper we study the implicit nodes as the suffix tree evolves. We partition the suffix tree's edges into collections of similar edges called bands, where implicit nodes exhibit identical behavior, and generalize the notion of open ended edges to allow implicit nodes to "float" within bands, only requiring updates when moving from one band to the next, adding up to only O(n) updates. We also show that internal implicit nodes are separated from each other by explicit suffix tree nodes and that all external implicit nodes are related to the same periodicity. These new properties may be used to keep track of the waves of implicit node updates and to build the suffix tree on-line in amortized linear time, providing access to all the implicit nodes in worst-case constant time. (C) 2012 Elsevier B.V. All rights reserved.
引用
收藏
页码:27 / 34
页数:8
相关论文
共 39 条
[1]   IMPROVED DYNAMIC DICTIONARY MATCHING [J].
AMIR, A ;
FARACH, M ;
IDURY, RM ;
LAPOUTRE, JA ;
SCHAFFER, AA .
INFORMATION AND COMPUTATION, 1995, 119 (02) :258-282
[2]  
[Anonymous], 1979, Transductions and context-free languages
[3]   OPTIMAL OFF-LINE DETECTION OF REPETITIONS IN A STRING [J].
APOSTOLICO, A ;
PREPARATA, FP .
THEORETICAL COMPUTER SCIENCE, 1983, 22 (03) :297-315
[4]  
Apostolico A., 1985, Combinatorial Algorithms on Words, P85
[5]  
Bender MA, 2002, LECT NOTES COMPUT SC, V2461, P152
[6]   THE SMALLEST AUTOMATION RECOGNIZING THE SUBWORDS OF A TEXT [J].
BLUMER, A ;
BLUMER, J ;
HAUSSLER, D ;
EHRENFEUCHT, A ;
CHEN, MT ;
SEIFERAS, J .
THEORETICAL COMPUTER SCIENCE, 1985, 40 (01) :31-55
[7]   COMPLETE INVERTED FILES FOR EFFICIENT TEXT RETRIEVAL AND ANALYSIS [J].
BLUMER, A ;
BLUMER, J ;
HAUSSLER, D ;
MCCONNELL, R ;
EHRENFEUCHT, A .
JOURNAL OF THE ACM, 1987, 34 (03) :578-595
[8]  
Breslauer D., 1996, Parallel Processing Letters, V6, P35, DOI 10.1142/S0129626496000054
[9]  
Breslauer D, 2011, LECT NOTES COMPUT SC, V7024, P301, DOI 10.1007/978-3-642-24583-1_30
[10]  
Breslauer D, 2011, LECT NOTES COMPUT SC, V7024, P156, DOI 10.1007/978-3-642-24583-1_16