Windowed nearest neighbour method for mining spatio-temporal clusters in the presence of noise

Cited by: 42
Authors
Pei, Tao [1]
Zhou, Chenghu [1]
Zhu, A-Xing [1]
Li, Baolin [1]
Qin, Chengzhi [1]
Affiliations
[1] Chinese Acad Sci, Inst Geog Sci & Nat Resources Res, State Key Lab Resources & Environm Informat Syst, Beijing, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
nearest neighbour; DBSCAN; cluster; spatio-temporal; windowed; expectation-maximization; DISEASE; FEATURES; TESTS;
DOI
10.1080/13658810903246155
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
In a spatio-temporal data set, identifying spatio-temporal clusters is difficult because of the coupling of time and space and the interference of noise. Previous methods employ either the window scanning technique or the spatio-temporal distance technique to identify spatio-temporal clusters. Although easily implemented, they suffer from the subjectivity in the choice of parameters for classification. In this article, we use the windowed kth nearest (WKN) distance (the geographic distance between an event and its kth geographical nearest neighbour among those events from which to the event the temporal distances are no larger than the half of a specified time window width [TWW]) to differentiate clusters from noise in spatio-temporal data. The windowed nearest neighbour (WNN) method is composed of four steps. The first is to construct a sequence of TWW factors, with which the WKN distances of events can be computed at different temporal scales. Second, the appropriate values of TWW (i.e. the appropriate temporal scales, at which the number of false positives may reach the lowest value when classifying the events) are indicated by the local maximum values of densities of identified clustered events, which are calculated over varying TWW by using the expectation-maximization algorithm. Third, the thresholds of the WKN distance for classification are then derived with the determined TWW. In the fourth step, clustered events identified at the determined TWW are connected into clusters according to their density connectivity in geographic-temporal space. Results of simulated data and a seismic case study showed that the WNN method is efficient in identifying spatio-temporal clusters. The novelty of WNN is that it can not only identify spatio-temporal clusters with arbitrary shapes and different spatio-temporal densities but also significantly reduce the subjectivity in the classification process.
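The WKN distance defined in the abstract can be illustrated with a minimal sketch. It assumes events are given as `(x, y, t)` tuples in planar coordinates and that geographic distance is Euclidean. The function name `wkn_distances` and the sample events are hypothetical and not taken from the article. The sketch covers only the distance definition, not the selection of TWW, the expectation-maximization step, or the density-connectivity step.

```python
import math

def wkn_distances(events, k, tww):
    """Windowed kth nearest (WKN) distance for each event.

    events: list of (x, y, t) tuples (assumed planar coordinates).
    k:      rank of the geographic nearest neighbour to use.
    tww:    time window width; neighbours must lie within tww / 2 in time.
    Returns math.inf for events with fewer than k neighbours in the window.
    """
    half = tww / 2.0
    result = []
    for i, (xi, yi, ti) in enumerate(events):
        # Geographic distances to all other events inside the time window.
        dists = sorted(
            math.hypot(xj - xi, yj - yi)
            for j, (xj, yj, tj) in enumerate(events)
            if j != i and abs(tj - ti) <= half
        )
        result.append(dists[k - 1] if len(dists) >= k else math.inf)
    return result

# Hypothetical sample: three events close in time, one isolated in time.
events = [
    (0.0, 0.0, 0.0),
    (1.0, 0.0, 1.0),
    (0.0, 2.0, 2.0),
    (5.0, 5.0, 10.0),
]
distances = wkn_distances(events, k=1, tww=4.0)
print(distances)
```

With `k=1` and `tww=4.0`, the last event has no neighbour within two time units, so its WKN distance is infinite. This brute-force version is quadratic in the number of events; a spatial index would be the usual choice for larger data sets.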
Pages: 925-948
Page count: 24
Related papers
46 in total
[1]  
Ankerst M, 1999, SIGMOD RECORD, VOL 28, NO 2 - JUNE 1999, P49
[2]  
[Anonymous], 9906010 NW U
[3]   Spatial aspects of MRSA epidemiology: a case study using stochastic simulation, kernel estimation and SaTScan [J].
Bastin, L. ;
Rollason, J. ;
Hilton, A. ;
Pillay, D. ;
Corcoran, C. ;
Elgy, J. ;
Lambert, P. ;
De, P. ;
Worthington, T. ;
Burrows, K. .
INTERNATIONAL JOURNAL OF GEOGRAPHICAL INFORMATION SCIENCE, 2007, 21 (07) :811-836
[4]   ST-DBSCAN: An algorithm for clustering spatial-temporal data [J].
Birant, Derya ;
Kut, Alp .
DATA & KNOWLEDGE ENGINEERING, 2007, 60 (01) :208-221
[5]   Nearest-neighbor clutter removal for estimating features in spatial point processes [J].
Byers, S ;
Raftery, AE .
JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION, 1998, 93 (442) :577-584
[6]   Pattern characteristics of foreshock sequences [J].
Chen, Y ;
Liu, J ;
Ge, HK .
PURE AND APPLIED GEOPHYSICS, 1999, 155 (2-4) :395-408
[7]  
*CHIN SEISM NETW D, 2009, CHIN SEISM NETW CSN
[8]  
CROMWELL PF, 1999, THEIR OWN WORDS CRIM, P50
[9]   MODIFIED RANDOMIZATION TESTS FOR NONPARAMETRIC HYPOTHESES [J].
DWASS, M .
ANNALS OF MATHEMATICAL STATISTICS, 1957, 28 (01) :181-187
[10]  
Ester M., 1996, DENSITY BASED ALGORI, DOI 10.5555/3001460.3001507