Similarity measures for OLAP sessions

Cited by: 38
Authors
Aligon, Julien [1]
Golfarelli, Matteo [2]
Marcel, Patrick [1]
Rizzi, Stefano [2]
Turricchia, Elisa [2]
Affiliations
[1] Univ Tours, Lab Informat, Tours, France
[2] Univ Bologna, DISI, I-40136 Bologna, Italy
Keywords
OLAP; Similarity measures; Query comparison; Sequence comparison
DOI
10.1007/s10115-013-0614-1
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
OLAP queries are not normally formulated in isolation, but in the form of sequences called OLAP sessions. Recognizing that two OLAP sessions are similar would be useful for different applications, such as query recommendation and personalization; however, the problem of measuring OLAP session similarity has not been studied so far. In this paper, we aim at filling this gap. First, we propose a set of similarity criteria derived from a user study conducted with a set of OLAP practitioners and researchers. Then, we propose a function for estimating the similarity between OLAP queries based on three components: the query group-by set, its selection predicate, and the measures required in output. To assess the similarity of OLAP sessions, we investigate the feasibility of extending four popular methods for measuring similarity, namely the Levenshtein distance, the Dice coefficient, the tf-idf weight, and the Smith-Waterman algorithm. Finally, we experimentally compare these four extensions to show that the Smith-Waterman extension is the one that best captures the users' criteria for session similarity.
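
The abstract names three query components and a Smith-Waterman extension but gives no formulas, so the following is only a minimal sketch of how such a measure might be assembled. The `Query` type, the use of Jaccard overlap for each component, the equal weights, and the `threshold` and `gap` values are all assumptions made for illustration, not the authors' definitions.

```python
from typing import FrozenSet, NamedTuple, Sequence


class Query(NamedTuple):
    """Hypothetical OLAP query: three sets, one per component named in the abstract."""
    group_by: FrozenSet[str]    # levels in the group-by set
    predicates: FrozenSet[str]  # atomic selection predicates
    measures: FrozenSet[str]    # measures returned in output


def jaccard(a: FrozenSet[str], b: FrozenSet[str]) -> float:
    """Set overlap in [0, 1]; two empty sets are treated as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def query_similarity(q1: Query, q2: Query,
                     weights=(1 / 3, 1 / 3, 1 / 3)) -> float:
    """Weighted average of per-component overlaps (weights are an assumption)."""
    w_g, w_p, w_m = weights
    return (w_g * jaccard(q1.group_by, q2.group_by)
            + w_p * jaccard(q1.predicates, q2.predicates)
            + w_m * jaccard(q1.measures, q2.measures))


def session_similarity(s1: Sequence[Query], s2: Sequence[Query],
                       threshold: float = 0.5, gap: float = 0.5) -> float:
    """Smith-Waterman-style local alignment score between two sessions.

    Aligning two queries contributes (similarity - threshold), so pairs
    below the threshold are penalized; skipping a query costs `gap`.
    """
    prev = [0.0] * (len(s2) + 1)
    best = 0.0
    for q1 in s1:
        curr = [0.0]
        for j, q2 in enumerate(s2, start=1):
            match = prev[j - 1] + query_similarity(q1, q2) - threshold
            score = max(0.0, match, prev[j] - gap, curr[j - 1] - gap)
            curr.append(score)
            best = max(best, score)
        prev = curr
    return best
```

The returned score is unnormalized and grows with the length of the best-matching subsequence; dividing by the score of a perfect alignment of the shorter session would be one way to map it into [0, 1].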
Pages: 463-489
Number of pages: 27