Integrating domain heterogeneous data sources using decomposition aggregation queries

Cited by: 6
Authors
Xu, Jian [1]
Pottinger, Rachel [1]
Affiliations
[1] Univ British Columbia, Dept Comp Sci, Vancouver, BC V6T 1Z4, Canada
Funding
Natural Sciences and Engineering Research Council of Canada
Keywords
Semantic integration; Aggregation; Query optimization; LOCAL-SEARCH; ALGORITHM; DATABASES;
DOI
10.1016/j.is.2013.06.003
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
The decomposition aggregation query (DAQ) introduced in this paper extends semantic integration queries by allowing query translation to create aggregate queries based on the DAQ's novel three-role structure. We describe the application of DAQs to integrating domain heterogeneous data sources, the new semantics of DAQ answers, and the query translation algorithm, called "aggregation rewriting". A central problem in optimizing DAQ processing is determining the data sources toward which the DAQ is translated. Our source selection algorithm has cover-finding and partitioning steps, which are optimized to (1) lower the processing overhead while speeding up query answering and (2) eliminate duplicates with minimal overhead. We establish connections between source selection optimizations and classic NP-hard optimization problems and solve the resulting problems with efficient solvers. We empirically study both the DAQ query translation and the source selection algorithms using real-world and synthetic data sets. The results show that the query translation algorithms scale well in the size of both aggregations and data sources, and that the source selection algorithms save a considerable amount of computational resources. (C) 2013 Elsevier Ltd. All rights reserved.
Pages: 80-107
Page count: 28