Spatial Clustering Overview and Comparison: Accuracy, Sensitivity, and Computational Expense

Cited by: 92
Authors
Grubesic, Tony H. [1]
Wei, Ran [2]
Murray, Alan T. [1]
Affiliations
[1] Drexel Univ, Ctr Spatial Analyt & Geocomputat, Coll Comp & Informat, Philadelphia, PA 19104 USA
[2] Univ Utah, Dept Geog, Salt Lake City, UT 84112 USA
Keywords
hot spots; cluster analysis; method selection; knowledge discovery; scale; GENETIC ALGORITHM; POWER COMPARISONS; DISEASE CLUSTERS; STATISTICS; ASSOCIATION; LIKELIHOOD; MULTIPLE; PATTERN; TESTS;
DOI
10.1080/00045608.2014.958389
Chinese Library Classification
P9 [Physical Geography]; K9 [Geography]
Discipline classification codes
0705; 070501
Abstract
Cluster analysis continues to be an important exploratory technique in scientific inquiry. It is used widely in geography, public health, criminology, ecology, and many other fields. Spatial cluster detection is driven by geographic information corresponding to the location of activities, requiring appropriate and meaningful treatment of space and spatial relationships combined with observed attributes of location and events. To date, this has meant utilizing dedicated measures and techniques to structure and account for distance, neighbors, contiguity, irregular geographic morphology, and so on. Unfortunately, all spatial clustering approaches, regardless of their theoretical underpinning, statistical foundation, or mathematical specification, have limitations in accuracy, sensitivity, and the computational effort required for identifying clusters. As a result, a major challenge in practice is determining which technique(s) will provide the most meaningful insights for a particular substantive issue or planning context. The purpose of this article is to provide an overview and evaluation of spatial clustering techniques, identifying the strengths and weaknesses of the most widely applied approaches. Results suggest that performance varies significantly in terms of accuracy, sensitivity, and computational expense. This is noteworthy because the misidentification of clusters, whether false positives or false negatives, has the potential to bias not only hypothesis formulation but also pragmatic facets of policy, process, and planning efforts within a region.
Pages: 1134-1155
Page count: 22