Spatio-Temporal Linkage over Location-Enhanced Services

Cited by: 15
Authors
Basik, Fuat [1]
Gedik, Bugra [1]
Etemoglu, Cagri [2]
Ferhatosmanoglu, Hakan [1, 3]
Affiliations
[1] Bilkent Univ, Dept Comp Engn, TR-06800 Ankara, Turkey
[2] Turk Telekom, TR-4349 Istanbul, Turkey
[3] Univ Warwick, Dept Comp Sci, Coventry, W Midlands, England
Funding
National Science Foundation (United States);
Keywords
ENTITY RESOLUTION;
DOI
10.1109/TMC.2017.2711027
Chinese Library Classification number
TP [Automation technology, computer technology];
Subject classification code
0812;
Abstract
We are witnessing an enormous growth in the volume of data generated by various online services. An important portion of this data contains geographic references, since many of these services are location-enhanced and thus produce spatio-temporal records of their usage. We postulate that the spatio-temporal usage records belonging to the same real-world entity can be matched across records from different location-enhanced services. Linking spatio-temporal records enables data analysts and service providers to obtain information that they cannot derive by analyzing only one set of usage records. In this paper, we develop a new linkage model that can be used to match entities from two sets of spatio-temporal usage records belonging to two different location-enhanced services. This linkage model is based on the concept of k-l diversity, which we developed to capture both spatial and temporal aspects of the linkage. To realize this linkage model in practice, we develop a scalable linking algorithm called ST-Link, which makes use of effective spatial and temporal filtering mechanisms that significantly reduce the search space for matching users. Furthermore, ST-Link utilizes sequential scan procedures to avoid random disk access and thus scales to large datasets. We evaluated our work with respect to accuracy and performance using several datasets. Experiments show that ST-Link is effective in practice for performing spatio-temporal linkage and can scale to large datasets.
Pages: 447-460
Page count: 14