Data exchange: semantics and query answering

被引:529
作者
Fagin, R
Kolaitis, PG
Miller, RJ
Popa, L
机构
[1] Univ Calif Santa Cruz, Santa Cruz, CA 95064 USA
[2] Univ Toronto, Toronto, ON, Canada
关键词
data exchange; data integration; dependencies; universal solution; chase; query answering; certain answers; computational complexity; first-order inexpressibility;
D O I
10.1016/j.tcs.2004.10.033
中图分类号
TP301 [理论、方法];
学科分类号
081202 ;
摘要
Data exchange is the problem of taking data structured under a source schema and creating an instance of a target schema that reflects the source data as accurately as possible. In this paper, we address foundational and algorithmic issues related to the semantics of data exchange and to the query answering problem in the context of data exchange. These issues arise because, given a source instance, there may be many target instances that satisfy the constraints of the data exchange problem. We give an algebraic specification that selects, among all solutions to the data exchange problem, a special class of solutions that we call universal. We show that a universal solution has no more and no less data than required for data exchange and that it represents the entire space of possible solutions. We then identify fairly general, yet practical, conditions that guarantee the existence of a universal solution and yield algorithms to compute a canonical universal solution efficiently. We adopt the notion of the "certain answers" in indefinite databases for the semantics for query answering in data exchange. We investigate the computational complexity of computing the certain answers in this context and also address other algorithmic issues that arise in data exchange. In particular, we study the problem of computing the certain answers of target queries by simply evaluating them on a canonical universal solution, and we explore the boundary of what queries can and cannot be answered this way, in a data exchange setting. 2004 Elsevier B.V. All rights reserved.
引用
收藏
页码:89 / 124
页数:36
相关论文
共 34 条
  • [1] Abiteboul S., 1998, Proceedings of the Seventeenth ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems. PODS 1998, P254, DOI 10.1145/275487.275516
  • [2] Abiteboul S, 1997, LECT NOTES COMPUT SC, V1186, P351
  • [3] ABITEBOUL S, 2000, UNPUB COMPLEXITY ANS
  • [4] ARENAS P, 2004, P ACM S PRINC DAT SY, P229
  • [5] A PROOF PROCEDURE FOR DATA DEPENDENCIES
    BEERI, C
    VARDI, MY
    [J]. JOURNAL OF THE ACM, 1984, 31 (04) : 718 - 741
  • [6] CALI A, 2002, P 14 C ADV INF SYST, P262
  • [7] INCLUSION DEPENDENCIES AND THEIR INTERACTION WITH FUNCTIONAL-DEPENDENCIES
    CASANOVA, MA
    FAGIN, R
    PAPADIMITRIOU, CH
    [J]. JOURNAL OF COMPUTER AND SYSTEM SCIENCES, 1984, 28 (01) : 29 - 59
  • [8] Chandra Ashok K., 1977, STOC'77: Proceedings of the ninth annual ACM symposium on Theory of computing, P77, DOI DOI 10.1145/800105.803397
  • [9] Cosmadakis S. S., 1986, ADV COMPUTING RES, V3, P163
  • [10] Deutsch A, 2003, LECT NOTES COMPUT SC, V2572, P225