Efficient Computation of Top-k Frequent Terms over Spatio-temporal Ranges

Cited by: 20
Authors
Ahmed, Pritom [1]
Hasan, Mahbub [1]
Kashyap, Abhijith [1]
Hristidis, Vagelis [1]
Tsotras, Vassilis J. [1]
Affiliations
[1] UC Riverside, Riverside, CA 92521 USA
Source
SIGMOD'17: PROCEEDINGS OF THE 2017 ACM INTERNATIONAL CONFERENCE ON MANAGEMENT OF DATA | 2017
Funding
National Science Foundation (USA);
Keywords
Top-K; Spatio-Temporal Databases; Social Networks;
DOI
10.1145/3035918.3064032
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
The wide availability of tracking devices has drastically increased the role of geolocation in social networks, resulting in new commercial applications; for example, marketers can identify current trending topics within a region of interest and focus their products accordingly. In this paper we study a basic analytics query on geo-tagged data, namely: given a spatiotemporal region, find the most frequent terms among the social posts in that region. While there has been prior work on keyword search on spatial data (find the objects nearest to the query point that contain the query keywords), and on group keyword search on spatial data (retrieving groups of objects), our problem is different in that it returns keywords and aggregated frequencies as output, instead of having the keyword as input. Moreover, we differ from works addressing the streamed version of this query in that we operate on large, disk-resident data and we provide exact answers. We propose an index structure and algorithms to efficiently answer such top-k spatiotemporal range queries, which we refer to as Top-k Frequent Spatiotemporal Terms (kFST) queries. Our index structure employs an R-tree augmented by top-k sorted term lists (STLs), where a key challenge is to balance the size of the index to achieve faster execution and smaller space requirements. We theoretically study and experimentally validate the ideal length of the stored term lists, and perform detailed experiments to evaluate the performance of the proposed methods compared to baselines on real datasets.
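
To make the query semantics in the abstract concrete, the following is a minimal brute-force sketch of a kFST query: scan all posts, keep those inside the spatiotemporal range, and return the k most frequent terms with their counts. It is not the paper's R-tree/STL index, only a naive full scan of the kind the indexed approach is meant to avoid. The names `Post`, `Range`, and `kfst_bruteforce` are hypothetical, and two choices are assumptions not stated in the abstract: each term is counted once per post, and ties are broken alphabetically.

```python
from collections import Counter
from typing import Iterable, List, NamedTuple, Tuple


class Post(NamedTuple):
    """A geo-tagged post (hypothetical record layout)."""
    x: float
    y: float
    t: float
    terms: Tuple[str, ...]


class Range(NamedTuple):
    """An axis-aligned spatiotemporal query box (hypothetical)."""
    x_min: float
    x_max: float
    y_min: float
    y_max: float
    t_min: float
    t_max: float


def kfst_bruteforce(posts: Iterable[Post], region: Range, k: int) -> List[Tuple[str, int]]:
    """Return the k most frequent terms among posts inside `region`.

    Assumption: a term counts once per post, and ties are broken
    alphabetically so the output is deterministic.
    """
    counts: Counter = Counter()
    for p in posts:
        inside = (
            region.x_min <= p.x <= region.x_max
            and region.y_min <= p.y <= region.y_max
            and region.t_min <= p.t <= region.t_max
        )
        if inside:
            counts.update(set(p.terms))
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:k]


# Made-up sample data for illustration only.
posts = [
    Post(1.0, 1.0, 10.0, ("coffee", "rain")),
    Post(2.0, 2.0, 20.0, ("coffee", "traffic")),
    Post(9.0, 9.0, 15.0, ("coffee", "parade")),   # outside the spatial range
    Post(1.5, 1.5, 99.0, ("rain",)),              # outside the time range
]
region = Range(0.0, 3.0, 0.0, 3.0, 0.0, 30.0)
top2 = kfst_bruteforce(posts, region, 2)
```

The scan touches every post, so its cost grows with the dataset rather than with the query range; the abstract's index instead stores precomputed sorted term lists at R-tree nodes to avoid this.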
Pages: 1227-1241
Page count: 15