View-based query processing: On the relationship between rewriting, answering and losslessness

Cited by: 24
Authors
Calvanese, Diego
De Giacomo, Giuseppe
Lenzerini, Maurizio
Vardi, Moshe Y.
Affiliations
[1] Free Univ Bozen Bolzano, Fac Comp Sci, I-39100 Bolzano, Italy
[2] Univ Roma La Sapienza, Dipartimento Informat & Sistemist, I-00198 Rome, Italy
[3] Rice Univ, Dept Comp Sci, Houston, TX 77251 USA
Funding
US National Science Foundation; EU Horizon 2020;
Keywords
query containment; query rewriting; query answering; losslessness; conjunctive queries; regular path queries; semistructured data;
DOI
10.1016/j.tcs.2006.11.006
Chinese Library Classification number
TP301 [Theory, methods];
Discipline classification code
081202 ;
Abstract
As a result of the extensive research in view-based query processing, three notions have been identified as fundamental, namely rewriting, answering, and losslessness. Answering amounts to computing the tuples satisfying the query in all databases consistent with the views. Rewriting consists in first reformulating the query in terms of the views and then evaluating the rewriting over the view extensions. Losslessness holds if we can answer the query by solely relying on the content of the views. While the mutual relationship between these three notions is easy to identify in the case of conjunctive queries, the terrain of notions gets considerably more complicated going beyond such a query class. In this paper, we revisit the notions of answering, rewriting, and losslessness and clarify their relationship in the setting of semistructured databases, and in particular for the basic query class in this setting, i.e., two-way regular path queries. Our first result is a clean explanation of the relationship between answering and rewriting, in which we characterize rewriting as a "linear approximation" of query answering. We show that applying this linear approximation to the constraint-satisfaction framework yields an elegant automata-theoretic approach to query rewriting. As for losslessness, we show that there are indeed two distinct interpretations for this notion, namely with respect to answering, and with respect to rewriting. We also show that the constraint-theoretic approach and the automata-theoretic approach can be combined to give an algorithmic characterization of the various facets of losslessness. Finally, we deal with the problem of coping with loss, by considering mechanisms aimed at explaining lossiness to the user. © 2006 Elsevier B.V. All rights reserved.
Pages: 169-182
Page count: 14