Search and Aggregation in XML Documents

Cited by: 1
Authors
Habi, Abdelmalek [1]
Effantin, Brice [1]
Kheddouci, Hamamache [1]
Affiliations
[1] Univ Lyon 1, Univ Lyon, CNRS LIRIS, UMR 5205, F-69622 Lyon, France
Source
DATABASE AND EXPERT SYSTEMS APPLICATIONS, DEXA 2017, PT I | 2017 / Vol. 10438
DOI
10.1007/978-3-319-64468-4_22
Chinese Library Classification
TP [Automation technology, computer technology]
Subject classification code
0812
Abstract
Information retrieval is migrating from the traditional paradigm (returning an ordered list of responses) to the aggregated search paradigm (grouping the most comprehensive and relevant answers into one final aggregated document). The Extensible Markup Language (XML) is now an important standard for information exchange and representation. Documents and queries are usually processed through their tree representations, which makes it possible to treat XML document retrieval as a tree matching problem between the document trees and the query tree. Several paradigms for retrieving XML documents have been proposed in the literature, but only a few of them try to aggregate a set of XML documents in order to provide more significant answers to a given query. In this paper, we propose and evaluate an aggregated search method to obtain the most accurate and richest answers in XML fragment search. Our search method is based on the Top-k Approximate Subtree Matching (TASM) algorithm, and a new similarity function is proposed to improve the returned fragments. An aggregation process is then presented to generate a single aggregated response containing the most relevant, exhaustive and non-redundant information given by the fragments. The method is evaluated on two real-world datasets. Experiments show that it generates good results in terms of relevance and quality.
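
The pipeline outlined in the abstract (match a query tree against document subtrees, keep the best k fragments, then merge them without redundancy) can be illustrated with a small sketch. This is not the TASM algorithm, which ranks subtrees by tree edit distance, and it is not the similarity function proposed in the paper. As stated assumptions, the sketch scores subtrees with a multiset Jaccard overlap of tags and text tokens, considers only subtrees whose root tag equals the query's root tag, and treats two children as redundant when their tag and token multisets are equal. All function names and the sample data are hypothetical.

```python
import copy
import xml.etree.ElementTree as ET
from collections import Counter


def label_bag(elem):
    """Multiset of tags and lowercased text tokens found in a subtree."""
    bag = Counter()
    for node in elem.iter():
        bag[("tag", node.tag)] += 1
        for token in (node.text or "").split():
            bag[("text", token.lower())] += 1
    return bag


def similarity(query, candidate):
    """Multiset Jaccard overlap in [0, 1]; a stand-in for a tree distance."""
    q, c = label_bag(query), label_bag(candidate)
    union = sum((q | c).values())
    return sum((q & c).values()) / union if union else 0.0


def top_k_subtrees(doc_root, query_root, k):
    """Score subtrees rooted at the query's root tag and keep the k best."""
    scored = [(similarity(query_root, node), node)
              for node in doc_root.iter(query_root.tag)]
    scored.sort(key=lambda pair: pair[0], reverse=True)  # stable sort
    return scored[:k]


def aggregate(fragments, root_tag="aggregate"):
    """Merge the children of all fragments, skipping redundant children."""
    result = ET.Element(root_tag)
    seen = set()
    for fragment in fragments:
        for child in fragment:
            key = frozenset(label_bag(child).items())
            if key not in seen:
                seen.add(key)
                result.append(copy.deepcopy(child))
    return result


# Hypothetical sample data, not taken from the paper's datasets.
doc = ET.fromstring(
    "<library>"
    "<book><title>XML Retrieval</title><author>Smith</author></book>"
    "<book><title>XML Retrieval</title><year>2017</year></book>"
    "<book><title>Graph Theory</title><author>Jones</author></book>"
    "</library>"
)
query = ET.fromstring("<book><title>XML Retrieval</title></book>")

best = top_k_subtrees(doc, query, k=2)
merged = aggregate(node for _, node in best)
print(ET.tostring(merged, encoding="unicode"))
```

In this toy run the two matching `book` fragments share a title but each contributes a different extra field, so the aggregated element keeps the title once and adds both the author and the year. Replacing `similarity` with an edit-distance-based score would bring the sketch closer to approximate subtree matching.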
Pages: 290-304
Page count: 15