XML Matchers: Approaches and challenges

Cited by: 19
Authors
Agreste, Santa [1]
De Meo, Pasquale [2]
Ferrara, Emilio [3]
Ursino, Domenico [4]
Affiliations
[1] Univ Messina, Dept Math & Informat, I-98166 Messina, Italy
[2] Univ Messina, Dept Ancient & Modern Civilizat, I-98166 Messina, Italy
[3] Indiana Univ Bloomington, Sch Informat & Comp, Bloomington, IN USA
[4] Univ Mediterranea Reggio Calabria, Dept Informat Infrastruct & Sustainable Energy En, I-89122 Reggio Di Calabria, Italy
Keywords
Schema Matching; DTD; XML Schema; XSD; XML source clustering; Uncertainty management in XML Matchers; SIMILARITY; SCHEMAS; UNCERTAINTY; INTEGRATION; DOCUMENTS; ALGORITHM; FRAMEWORK; SYSTEMS
DOI
10.1016/j.knosys.2014.04.044
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification code
081104; 0812; 0835; 1405
Abstract
Schema Matching, i.e. the process of discovering semantic correspondences between concepts adopted in different data source schemas, has been a key topic in Database and Artificial Intelligence research areas for many years. In the past, it was largely investigated especially for classical database models (e.g., E/R schemas, relational databases, etc.). However, in recent years, the widespread adoption of XML in the most disparate application fields pushed a growing number of researchers to design XML-specific Schema Matching approaches, called XML Matchers, aiming at finding semantic matchings between concepts defined in DTDs and XSDs. XML Matchers do not just take well-known techniques originally designed for other data models and apply them to DTDs/XSDs, but they exploit specific XML features (e.g., the hierarchical structure of a DTD/XSD) to improve the performance of the Schema Matching process. The design of XML Matchers is currently a well-established research area. The main goal of this paper is to provide a detailed description and classification of XML Matchers. We first describe to what extent the specificities of DTDs/XSDs impact on the Schema Matching task. Then we introduce a template, called XML Matcher Template, that describes the main components of an XML Matcher, their role and behavior. We illustrate how each of these components has been implemented in some popular XML Matchers. We consider our XML Matcher Template as the baseline for objectively comparing approaches that, at first glance, might appear as unrelated. The introduction of this template can be useful in the design of future XML Matchers. Finally, we analyze commercial tools implementing XML Matchers and introduce two challenging issues strictly related to this topic, namely XML source clustering and uncertainty management in XML Matchers. © 2014 Elsevier B.V. All rights reserved.
Pages: 190-209
Page count: 20