Integrating domain heterogeneous data sources using decomposition aggregation queries

Cited by: 6
Authors
Xu, Jian [1]
Pottinger, Rachel [1]
Affiliation
[1] Univ British Columbia, Dept Comp Sci, Vancouver, BC V6T 1Z4, Canada
Funding
Natural Sciences and Engineering Research Council of Canada (NSERC)
Keywords
Semantic integration; Aggregation; Query optimization; LOCAL-SEARCH; ALGORITHM; DATABASES;
DOI
10.1016/j.is.2013.06.003
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
This paper introduces the decomposition aggregation query (DAQ), which extends semantic integration queries by allowing query translation to create aggregate queries based on the DAQ's novel three-role structure. We describe the application of DAQs to integrating domain heterogeneous data sources, the new semantics of DAQ answers, and the query translation algorithm, called "aggregation rewriting". A central problem in optimizing DAQ processing is determining the data sources to which the DAQ is translated. Our source selection algorithm has cover-finding and partitioning steps, which are optimized to (1) lower the processing overhead while speeding up query answering and (2) eliminate duplicates with minimal overhead. We establish connections between the source selection optimizations and classic NP-hard optimization problems, and solve them with efficient solvers. We empirically study both the DAQ query translation and the source selection algorithms using real-world and synthetic data sets. The results show that the query translation algorithms scale well with respect to both the size of the aggregations and the data sources, and that the source selection algorithms save a considerable amount of computational resources. (C) 2013 Elsevier Ltd. All rights reserved.
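As a rough illustration of the two source selection steps named in the abstract, the sketch below treats cover-finding as a greedy set cover and partitioning as assigning each item to exactly one chosen source so that no item is aggregated twice. This record does not give the paper's actual algorithms or solvers; the greedy heuristic, the function names `greedy_cover` and `partition`, and the sample sources `s1` to `s4` are all assumptions made for illustration only.

```python
from typing import Dict, List, Set


def greedy_cover(universe: Set[str], sources: Dict[str, Set[str]]) -> List[str]:
    """Hypothetical cover-finding step: greedy set cover over the sources."""
    uncovered = set(universe)
    chosen: List[str] = []
    while uncovered:
        # Pick the source that covers the most still-uncovered items;
        # iterating in sorted order makes tie-breaking deterministic.
        best = max(sorted(sources), key=lambda s: len(sources[s] & uncovered))
        gain = sources[best] & uncovered
        if not gain:
            raise ValueError(f"sources cannot cover: {sorted(uncovered)}")
        chosen.append(best)
        uncovered -= gain
    return chosen


def partition(
    universe: Set[str], sources: Dict[str, Set[str]], chosen: List[str]
) -> Dict[str, Set[str]]:
    """Hypothetical partitioning step: give each item to exactly one chosen
    source, so overlapping sources do not contribute duplicates."""
    remaining = set(universe)
    assignment: Dict[str, Set[str]] = {}
    for name in chosen:
        assignment[name] = sources[name] & remaining
        remaining -= assignment[name]
    return assignment


# Made-up example: four overlapping sources over five items.
universe = {"a", "b", "c", "d", "e"}
sources = {
    "s1": {"a", "b", "c"},
    "s2": {"c", "d"},
    "s3": {"d", "e"},
    "s4": {"a"},
}
chosen = greedy_cover(universe, sources)
assignment = partition(universe, sources, chosen)
```

Greedy set cover is only one classic heuristic for this NP-hard problem; it is used here because it is short, not because the paper is known to use it.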
Pages: 80-107
Number of pages: 28