Top-k Queries over Digital Traces

Cited by: 2
Authors
Li, Yifan [1 ]
Yu, Xiaohui [1 ]
Koudas, Nick [2 ]
Affiliations
[1] York University, North York, ON, Canada
[2] University of Toronto, Toronto, ON, Canada
Source
SIGMOD '19: PROCEEDINGS OF THE 2019 INTERNATIONAL CONFERENCE ON MANAGEMENT OF DATA | 2019
Funding
Natural Sciences and Engineering Research Council of Canada (NSERC)
Keywords
NEAREST NEIGHBORS; OPTIMIZATION; SEARCH; INDEX; TREE;
DOI
10.1145/3299869.3319857
Chinese Library Classification
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Recent advances in social and mobile technology have produced an abundance of digital traces (mobile check-ins, associations of mobile devices with specific WiFi hotspots, etc.) that reveal the physical presence history of diverse sets of entities (e.g., humans, devices, and vehicles). One challenging yet important task is to identify the k entities most closely associated with a given query entity based on their digital traces. We propose a suite of indexing techniques and algorithms to enable fast query processing for this problem at scale. We first define a generic family of functions measuring the association between entities, and then propose algorithms to transform digital traces into a lower-dimensional space for more efficient computation. We subsequently design a hierarchical indexing structure that organizes entities so that closely associated entities tend to appear together. We then develop algorithms to process top-k queries using the index. We theoretically analyze the pruning effectiveness of the proposed methods based on a mobility model that we propose and validate in real-life situations. Finally, we conduct extensive experiments on both synthetic and real datasets at scale, evaluating the performance of our techniques both analytically and experimentally; the results confirm the effectiveness of our approach and its superiority over other applicable approaches across a variety of parameter settings and datasets.
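To make the query in the abstract concrete, the following is a minimal brute-force sketch of a top-k association query over digital traces. It is not the paper's method: the trace representation (a set of hypothetical `(location, time_slot)` pairs), the use of Jaccard similarity as the association function, and the names `association` and `top_k` are all assumptions made for illustration. The paper's dimensionality reduction, hierarchical index, and pruning are not reproduced here; this baseline simply scores every entity.

```python
import heapq
from typing import Dict, List, Set, Tuple

# Assumed representation: a trace is a set of (location id, time slot) pairs.
Trace = Set[Tuple[str, int]]


def association(a: Trace, b: Trace) -> float:
    """Jaccard similarity, used here as a stand-in association function."""
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


def top_k(query: str, traces: Dict[str, Trace], k: int) -> List[Tuple[str, float]]:
    """Return the k entities most associated with `query` by a full scan."""
    q = traces[query]
    scored = (
        (association(q, trace), entity)
        for entity, trace in traces.items()
        if entity != query
    )
    # nlargest keeps only the k best (score, entity) pairs.
    return [(entity, score) for score, entity in heapq.nlargest(k, scored)]


# Hypothetical example data.
traces = {
    "alice": {("cafe", 9), ("office", 10), ("gym", 18)},
    "bob": {("cafe", 9), ("office", 10), ("bar", 21)},
    "carol": {("gym", 18), ("park", 7)},
    "dave": {("mall", 15)},
}

print(top_k("alice", traces, 2))  # [('bob', 0.5), ('carol', 0.25)]
```

A full scan like this costs one similarity computation per entity, which is the cost an index with pruning is meant to avoid at scale.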
Pages: 954-971
Page count: 18