Fraction-Score: A New Support Measure for Co-location Pattern Mining

Cited by: 24
Authors
Chan, Harry Kai-Ho [1]
Long, Cheng [2]
Yan, Da [3]
Wong, Raymond Chi-Wing [1]
Affiliations
[1] Hong Kong Univ Sci & Technol, Hong Kong, Peoples R China
[2] Nanyang Technol Univ, Singapore, Singapore
[3] Univ Alabama Birmingham, Birmingham, AL USA
Source
2019 IEEE 35th International Conference on Data Engineering (ICDE 2019) | 2019
Keywords
FRAMEWORK; ALGORITHM;
DOI
10.1109/ICDE.2019.00136
Chinese Library Classification
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Co-location patterns are well established for spatial objects with categorical labels; they capture the phenomenon that objects with certain labels are often located in close geographic proximity. Similar to frequent itemsets, co-location patterns are defined based on a support measure that quantifies the popularity (or prevalence) of a pattern candidate (a label set). Quite a few support measures exist for defining co-location patterns, and they share the idea of counting the number of instances of a given label set C as its support, where an instance of C is an object set whose objects carry all the labels in C and are located close to one another. Unfortunately, these measures suffer from various weaknesses: some fail to capture all possible instances, while others overlook cases in which multiple instances overlap. In this paper, we propose a new measure called Fraction-Score, whose idea is to count instances fractionally if they overlap. Compared to existing measures, Fraction-Score not only captures all possible instances but also handles overlapping instances appropriately (so that the supports defined are more meaningful and consistent with the desirable anti-monotonicity property). To solve the co-location pattern mining problem based on Fraction-Score, we develop efficient algorithms that are significantly faster than a baseline adapted from the state of the art. We conduct extensive experiments on both real and synthetic datasets, which verify the superiority of Fraction-Score and the efficiency of the developed algorithms.
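The abstract does not give the formal definition of Fraction-Score, so the following is only a toy sketch of the general idea of counting overlapping instances fractionally. Everything in it is an assumption made for illustration: the function names `find_instances` and `fractional_support`, the rule that an instance holds one object per label with all pairwise distances at most `d`, and the weighting rule (each instance is weighted by the smallest share `1 / usage` among its objects). It should not be read as the measure defined in the paper.

```python
from collections import Counter
from itertools import combinations, product
from math import dist


def find_instances(objects, labels, d):
    """Enumerate instances of a label set (illustrative definition only).

    objects: dict mapping object id -> (label, (x, y)).
    An instance picks one object per label such that every pair of
    picked objects lies within Euclidean distance d.
    """
    by_label = {
        t: [oid for oid, (lab, _) in objects.items() if lab == t]
        for t in labels
    }
    instances = []
    for combo in product(*(by_label[t] for t in sorted(labels))):
        close = all(
            dist(objects[a][1], objects[b][1]) <= d
            for a, b in combinations(combo, 2)
        )
        if close:
            instances.append(frozenset(combo))
    return instances


def fractional_support(instances):
    """Toy fractional count (assumed rule, not the paper's definition).

    An object shared by k instances contributes a share of 1/k to each;
    an instance is weighted by the smallest share among its objects.
    """
    usage = Counter(oid for inst in instances for oid in inst)
    return sum(min(1 / usage[oid] for oid in inst) for inst in instances)


# Hypothetical data: one "A" object with two nearby "B" objects and one far away.
demo_objects = {
    "a1": ("A", (0.0, 0.0)),
    "b1": ("B", (1.0, 0.0)),
    "b2": ("B", (0.0, 1.0)),
    "b3": ("B", (10.0, 10.0)),
}
demo_instances = find_instances(demo_objects, {"A", "B"}, d=1.5)
print(len(demo_instances))                 # plain instance count: 2
print(fractional_support(demo_instances))  # fractional count: 1.0
```

In this toy data the two instances {a1, b1} and {a1, b2} overlap on `a1`, so a plain count reports 2 while the fractional count reports 1.0, which is the kind of discrepancy the abstract attributes to overlapping instances.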
Pages: 1514-1525
Page count: 12