A polygon-based clustering and analysis framework for mining spatial datasets

Cited by: 11
Authors
Wang, Sujing [1]
Eick, Christoph F. [1]
Affiliation
[1] Univ Houston, Dept Comp Sci, Houston, TX 77204 USA
Keywords
Spatial data mining; Dissimilarity functions for polygons; Polygon clustering; Polygon analysis; Mining related spatial datasets;
DOI
10.1007/s10707-013-0190-2
Chinese Library Classification
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Polygons provide natural representations for many types of geospatial objects, such as countries, buildings, and pollution hotspots. Thus, polygon-based data mining techniques are particularly useful for mining geospatial datasets. In this paper, we propose a polygon-based clustering and analysis framework for mining multiple geospatial datasets that have inherently hidden relations. In this framework, polygons are first generated from multiple geospatial point datasets by using a density-based contouring algorithm called DCONTOUR. Next, a density-based clustering algorithm called Poly-SNN with novel dissimilarity functions is employed to cluster polygons to create meta-clusters of polygons. Finally, post-processing analysis techniques are proposed to extract interesting patterns and user-guided summarized knowledge from meta-clusters. These techniques employ plug-in reward functions that capture a domain expert's notion of interestingness to guide the extraction of knowledge from meta-clusters. The effectiveness of our framework is tested in a real-world case study involving ozone pollution events in Texas. The experimental results show that our framework can reveal interesting relationships between different ozone hotspots represented by polygons; it can also identify interesting hidden relations between ozone hotspots and several meteorological variables, such as outdoor temperature, solar radiation, and wind speed.
Pages: 569-594
Page count: 26