USING MULTISET DISCRIMINATION TO SOLVE LANGUAGE PROCESSING PROBLEMS WITHOUT HASHING

被引:21
作者
CAI, JZ [1 ]
PAIGE, R [1 ]
机构
[1] NYU, COURANT INST, DEPT COMP SCI, NEW YORK, NY 10012 USA
基金
美国国家科学基金会;
关键词
D O I
10.1016/0304-3975(94)00183-J
中图分类号
TP301 [理论、方法];
学科分类号
081202 ;
摘要
It is generally assumed that hashing is essential to solve many language processing problems efficiently; e.g. symbol table formation and maintenance, grammar manipulation, basic block optimization, and global optimization. This paper questions this assumption, and initiates development of an efficient alternative compiler methodology without hashing or sorting. The methodology rests on efficient solutions to the basic problem of detecting duplicate values in a multiset, which we call multiset discrimination. Paige and Tarjan (1987) gave an efficient solution to multiset discrimination for detecting duplicate elements occurring in a multiset of varying length strings. The technique was used to develop an improved algorithm for lexicographic sorting, whose importance stems largely from its use in solving a variety of isomorphism problems (Aho et al., 1974), The current paper and a related paper (Paige, 1994) show that full lexicographic sorting is not needed to solve these isomorphism problems, because they can be solved more efficiently using straightforward extensions to the simpler multiset discrimination technique. By reformulating language processing problems in terms of multiset discrimination, we also show how almost every subtask of compilation can be solved without hashing in worst case running time no worse (and frequently better) than the best previous expected time solution (under the assumption that one hash operation takes unit expected time), Because of their simplicity, our solutions may be of practical as well as theoretical interest. The various applications presented culminate with a new algorithm to solve iterated strength reduction folded with useless code elimination that runs in worst case asymptotic time and auxiliary space Theta(/L/ + /L'/), where /L/ and /L'/ represent the lengths of the initial and optimized programs, respectively. The previous best solution due to Cocke and Kennedy (1977) takes Omega(/L/(3)/L'/) has operations in the worst case.
引用
收藏
页码:189 / 228
页数:40
相关论文
共 41 条
[1]  
Aho A. V., 1986, COMPILERS
[2]  
Aho A. V., 1974, DESIGN ANAL COMPUTER
[3]  
Allen F. E., 1981, Program flow analysis. Theory and applications, P79
[4]  
ALPERN B, 1988, 15TH P ACM POPL
[5]  
CARTER L, 1979, J CSS, V18, P143, DOI DOI 10.1016/0022-0000(79)90044-8
[6]  
CLINGER W, 1991, 8TH P ACM S PRINC PR, P155
[7]   ALGORITHM FOR REDUCTION OF OPERATOR STRENGTH [J].
COCKE, J ;
KENNEDY, K .
COMMUNICATIONS OF THE ACM, 1977, 20 (11) :850-856
[8]  
COCKE J, 1969, LECTURE NOTES CIMS
[9]  
Cocke John, 1970, ACM SIGPLAN NOTICES, V5, DOI [10.1145/390013.808480, DOI 10.1145/390013.808480]
[10]  
CYTRON R, 1985, CODE MOTION CONTROL