A dynamic data granulation through adjustable fuzzy clustering

Cited by: 29
Authors
Pedrycz, Witold [1, 2]
Affiliations
[1] Univ Alberta, Dept Elect & Comp Engn, Edmonton, AB T6R 2G7, Canada
[2] Polish Acad Sci, Syst Res Inst, PL-01447 Warsaw, Poland
Funding
Natural Sciences and Engineering Research Council of Canada (NSERC);
Keywords
Dynamic clustering; Cluster split and cluster merge; Data dynamics; Data snapshot; Fuzzy clustering; Reconstruction criterion;
DOI
10.1016/j.patrec.2008.07.001
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In this study, we develop a concept of dynamic data granulation realized in the presence of incoming data organized in the form of so-called data snapshots. For each of these snapshots we reveal a structure by running fuzzy clustering. The proposed algorithm of adjustable fuzzy C-means (FCM) exhibits a number of useful features which directly associate with the dynamic nature of the underlying data: (a) the number of clusters is adjusted from one data snapshot to another in order to capture the varying structure of patterns and its complexity, (b) continuity between the consecutively discovered structures is retained, viz. the clusters formed for a certain data snapshot are constructed as a result of evolving the clusters discovered in the preceding snapshot. We present a detailed clustering algorithm in which the mechanisms of adjustment of information granularity (the number of clusters) become the result of solutions to well-defined optimization tasks. The cluster splitting is guided by conditional fuzzy C-means (FCM) while cluster merging involves two neighboring prototypes. The level of information granularity throughout the process is controlled by a reconstruction criterion which quantifies an error resulting from pattern granulation and de-granulation. Numeric experiments provide a suitable illustration of the approach. © 2008 Elsevier B.V. All rights reserved.
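The abstract names two building blocks, fuzzy C-means and a reconstruction criterion, without giving formulas. The following is a minimal sketch of both, not the paper's algorithm: it assumes the standard FCM update equations and the commonly used reconstruction form in which each pattern is rebuilt as a membership-weighted average of the prototypes. The function names `fcm` and `reconstruction_error`, the fuzzifier default `m=2.0`, and the synthetic two-blob data are illustrative assumptions; the cluster split (conditional FCM) and merge steps described in the abstract are not implemented here.

```python
import numpy as np

def fcm(X, c, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Standard fuzzy C-means (assumed textbook form, not the paper's variant).

    X: (n, d) data, c: number of clusters, m: fuzzifier (> 1).
    Returns U of shape (c, n) with columns summing to 1, and prototypes V (c, d).
    """
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    U = rng.random((c, n))
    U /= U.sum(axis=0, keepdims=True)          # random fuzzy partition
    for _ in range(n_iter):
        W = U ** m
        V = (W @ X) / W.sum(axis=1, keepdims=True)               # prototypes
        d2 = ((X[None, :, :] - V[:, None, :]) ** 2).sum(axis=2)  # squared distances, (c, n)
        d2 = np.maximum(d2, 1e-12)             # guard against division by zero
        inv = d2 ** (-1.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=0, keepdims=True)             # membership update
        done = np.abs(U_new - U).max() < tol
        U = U_new
        if done:
            break
    W = U ** m
    V = (W @ X) / W.sum(axis=1, keepdims=True)  # prototypes consistent with final U
    return U, V

def reconstruction_error(X, U, V, m=2.0):
    """Granulate (X -> U) then de-granulate (U, V -> X_hat); return summed squared error.

    Assumed form: x_hat_k = sum_i u_ik^m v_i / sum_i u_ik^m.
    """
    W = U ** m
    X_hat = (W.T @ V) / W.sum(axis=0)[:, None]
    return float(((X - X_hat) ** 2).sum())

# Illustrative snapshot: two well-separated Gaussian blobs (synthetic data).
rng = np.random.default_rng(1)
snapshot = np.vstack([rng.normal(0.0, 0.3, (50, 2)),
                      rng.normal(5.0, 0.3, (50, 2))])
for c in (1, 2, 3):
    U, V = fcm(snapshot, c)
    print(c, reconstruction_error(snapshot, U, V))
```

In a snapshot-by-snapshot setting, an error of this kind could serve as the signal for adjusting the number of clusters: a large value on a new snapshot suggests the current prototypes no longer describe the data well, while a value that stays low after removing a prototype suggests two clusters could be merged. How the paper sets those thresholds is not stated in the abstract.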
Pages: 2059-2066
Number of pages: 8