XML schema clustering with semantic and hierarchical similarity measures

Cited by: 32
Authors
Nayak, Richi [1]
Iryadi, Wina [1]
Affiliations
[1] Queensland Univ Technol, Sch Informat Sci, Brisbane, Qld 4001, Australia
Keywords
clustering; data mining; document mining; XML; semi-structured data; semantic similarity; structural similarity; schema matching;
DOI
10.1016/j.knosys.2006.08.006
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
With the growing popularity of XML as a data representation language, collections of XML data have exploded in number. Methods are required to manage these collections and to discover useful information in them for improved document handling. We present a schema clustering process that organises heterogeneous XML schemas into groups. The methodology considers not only the linguistic properties and the context of the elements but also their hierarchical structural similarity. We support our findings with experiments and analysis. (c) 2006 Elsevier B.V. All rights reserved.
Pages: 336-349
Number of pages: 14