Towards improving XML search by using structure clustering technique

Cited by: 1
Authors
Shalabi, Rehab [1]
Elfatatry, Ahmed [1]
Affiliation
[1] Univ Alexandria, Inst Grad Studies & Res, Alexandria, Egypt
Keywords
Clustering; EXCLS; information retrieval; XCLS; XEdge; XML search engine; XQuery; EFFICIENT;
DOI
10.1177/0165551514560523
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Subject classification code
0812
Abstract
Searching large XML repositories is a challenging research problem. Clustering a large repository before searching it significantly improves the search process, because clustering divides the search space into smaller XML collections that can be searched more effectively. In this work, we present an enhanced method for clustering XML documents by structure. We also introduce a new representation of XML structure that preserves all of its characteristics without summarization. We then benchmark the search results of our improved method against those of the SAXON and Qizx XQuery processors. The comparison focuses on search processing time and the accuracy of the results, using datasets of different sizes for both homogeneous and heterogeneous XML documents. The results show better accuracy at the same level of performance.
Pages: 146-166
Number of pages: 21