On the minimization of XPath queries

Cited by: 20
Authors
Flesca, S. [1]
Furfaro, F. [1]
Masciari, E. [2]
Affiliations
[1] Univ Calabria, DEIS, I-87036 Arcavacata Di Rende, CS, Italy
[2] ICAR CNR, I-87036 Arcavacata Di Rende, CS, Italy
Keywords
languages; theory; query containment; query minimization; tree pattern matching; XPath expressions
DOI
10.1145/1326554.1326556
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
XPath expressions define navigational queries on XML data and are issued on XML documents to select sets of element nodes. Due to the wide use of XPath, which is embedded into several languages for querying and manipulating XML data, the problem of efficiently answering XPath queries has received increasing attention from the research community. As the efficiency of computing the answer of an XPath query depends on its size, replacing XPath expressions with equivalent ones having the smallest size is a crucial issue in this direction. This article investigates the minimization problem for a wide fragment of XPath (namely XP[*]), where the use of the most common operators (child, descendant, wildcard and branching) is allowed with some syntactic restrictions. The examined fragment consists of expressions which have not been specifically studied in the relational setting before: neither are they mere conjunctive queries (as the combination of "//" and "*" enables an implicit form of disjunction to be expressed) nor do they coincide with disjunctive ones (as the latter are more expressive). Three main contributions are provided. The "global minimality" property is shown to hold: the minimization of a given XPath expression can be accomplished by removing pieces of the expression, without having to re-formulate it (as for "general" disjunctive queries). Then, the complexity of the minimization problem is characterized, showing that it is the same as the containment problem. Finally, specific forms of XPath expressions are identified, which can be minimized in polynomial time.
Pages: 46