Schema matching based on position of attribute in query statement

Cited by: 2
Authors
Ding, Guohui [1]
Sun, Tianhe [1]
Affiliations
[1] Shenyang Aerospace University, Shenyang, People's Republic of China
Funding
National Natural Science Foundation of China;
Keywords
Schema matching; Database integration; Query log; Ant Colony Optimization; Attribute position; Query statement; COLONY; OPTIMIZATION;
DOI
10.1016/j.knosys.2014.11.005
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
Attribute-level schema matching is a critical step in numerous database applications, such as dataspaces, ontology merging, and schema integration. Many studies address this topic; however, they all ignore evidence about the positions of attributes in query statements, which is crucial for finding high-quality matches between schema attributes. In this paper, we propose a novel matching technique based on the positions at which attributes appear in the schema structure of query results. The position of an attribute in the query results reflects how important that attribute is to the user browsing those results. The core idea of our approach is to collect statistics about attribute positions from query logs and use them to find correspondences between attributes (matches). Our method works in three phases. First, we design a matrix to record the statistics about attribute positions. Second, we employ two scoring functions to measure the similarity between the statistics collected for the two schemas to be matched. Finally, we employ a traditional algorithm to find the optimal mapping. Furthermore, our approach can be combined with other existing matchers to obtain more accurate matching results. An experimental study shows that our approach is effective and performs well. (C) 2014 Elsevier B.V. All rights reserved.
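The three phases described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several parts are assumptions made only for illustration: each query is reduced to the ordered list of attributes in its result schema, a single scoring function based on L1 distance stands in for the paper's two (unspecified here) scoring functions, and an exhaustive search over permutations stands in for the mapping algorithm. The function names `position_matrix`, `similarity`, and `best_mapping` are hypothetical.

```python
from itertools import permutations


def position_matrix(query_log, width):
    """Phase 1 (assumed form): one row per attribute; entry k is the
    fraction of logged queries that place the attribute at position k."""
    attributes = sorted({attr for query in query_log for attr in query})
    counts = {attr: [0] * width for attr in attributes}
    for select_list in query_log:
        for pos, attr in enumerate(select_list):
            counts[attr][pos] += 1
    n = len(query_log)
    return {attr: [c / n for c in row] for attr, row in counts.items()}


def similarity(row_a, row_b):
    """Phase 2 (assumed scoring function): 1 minus half the L1 distance
    between two position rows, so identical rows score 1.0."""
    l1 = sum(abs(x - y) for x, y in zip(row_a, row_b))
    return 1.0 - l1 / 2.0


def best_mapping(log_a, log_b):
    """Phase 3 (stand-in): brute-force search for the one-to-one mapping
    with the highest total similarity. Only practical for small schemas,
    and schema A must not have more attributes than schema B."""
    width = max(len(query) for query in log_a + log_b)
    m_a = position_matrix(log_a, width)
    m_b = position_matrix(log_b, width)
    attrs_a, attrs_b = list(m_a), list(m_b)
    best, best_score = None, float("-inf")
    for perm in permutations(attrs_b, len(attrs_a)):
        score = sum(similarity(m_a[a], m_b[b]) for a, b in zip(attrs_a, perm))
        if score > best_score:
            best, best_score = dict(zip(attrs_a, perm)), score
    return best


# Hypothetical query logs: each inner list is the attribute order of one query result.
log_a = [["name", "price", "stock"], ["name", "price"], ["price", "name", "stock"]]
log_b = [["title", "cost", "qty"], ["title", "cost"], ["title", "qty", "cost"]]
print(best_mapping(log_a, log_b))
# prints {'name': 'title', 'price': 'cost', 'stock': 'qty'}
```

The exhaustive search grows factorially with the number of attributes, so for realistic schemas it would have to be replaced by an assignment or heuristic search method; the record's keywords mention Ant Colony Optimization in this context.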
Pages: 41-51
Number of pages: 11