A dynamic data granulation through adjustable fuzzy clustering

Cited by: 29

Author:

Pedrycz, Witold [1, 2]

Affiliations:

[1] Univ Alberta, Dept Elect & Comp Engn, Edmonton, AB T6R 2G7, Canada

[2] Polish Acad Sci, Syst Res Inst, PL-01447 Warsaw, Poland

Source:

PATTERN RECOGNITION LETTERS | 2008 / Vol. 29 / Issue 16

Funding:

Natural Sciences and Engineering Research Council of Canada;

Keywords:

Dynamic clustering; Cluster split and cluster merge; Data dynamics; Data snapshot; Fuzzy clustering; Reconstruction criterion;

DOI:

10.1016/j.patrec.2008.07.001

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

In this study, we develop a concept of dynamic data granulation realized in the presence of incoming data organized in the form of so-called data snapshots. For each of these snapshots we reveal a structure by running fuzzy clustering. The proposed algorithm of adjustable fuzzy C-means (FCM) exhibits a number of useful features which directly associate with the dynamic nature of the underlying data: (a) the number of clusters is adjusted from one data snapshot to another in order to capture the varying structure of patterns and its complexity, (b) continuity between the consecutively discovered structures is retained, viz. the clusters formed for a certain data snapshot are constructed as a result of evolving the clusters discovered in the preceding snapshot. We present a detailed clustering algorithm in which the mechanisms of adjustment of information granularity (the number of clusters) become the result of solutions to well-defined optimization tasks. The cluster splitting is guided by conditional fuzzy C-means (FCM) while cluster merging involves two neighboring prototypes. The level of information granularity throughout the process is controlled by a reconstruction criterion, which quantifies the error resulting from pattern granulation and de-granulation. Numeric experiments provide a suitable illustration of the approach. © 2008 Elsevier B.V. All rights reserved.
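The abstract names a reconstruction criterion but gives no formulas, so the following is only a minimal sketch of one common way such a criterion is computed. It assumes standard FCM membership grades with a fuzzification coefficient m = 2, and de-granulation as a prototype average weighted by the memberships raised to the power m. The function names (`memberships`, `reconstruct`, `reconstruction_error`) and the toy snapshot are hypothetical and are not taken from the paper.

```python
import math


def memberships(x, prototypes, m=2.0):
    """Granulation: membership of x in each cluster (standard FCM form, assumed)."""
    dists = [math.dist(x, v) for v in prototypes]
    # A pattern that coincides with a prototype belongs fully to that cluster.
    for i, d in enumerate(dists):
        if d == 0.0:
            return [1.0 if j == i else 0.0 for j in range(len(prototypes))]
    power = 2.0 / (m - 1.0)
    return [1.0 / sum((d_i / d_j) ** power for d_j in dists) for d_i in dists]


def reconstruct(x, prototypes, m=2.0):
    """De-granulation: rebuild x from its memberships and the prototypes."""
    weights = [u ** m for u in memberships(x, prototypes, m)]
    total = sum(weights)
    return [
        sum(w * v[k] for w, v in zip(weights, prototypes)) / total
        for k in range(len(x))
    ]


def reconstruction_error(data, prototypes, m=2.0):
    """Sum of squared distances between patterns and their reconstructions."""
    return sum(math.dist(x, reconstruct(x, prototypes, m)) ** 2 for x in data)


# Toy snapshot with two visible groups (illustrative data only).
snapshot = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
error_one = reconstruction_error(snapshot, [(5.0, 0.5)])
error_two = reconstruction_error(snapshot, [(0.0, 0.5), (10.0, 0.5)])
```

On this toy snapshot the error with two prototypes is far lower than with one, which is the kind of signal that could motivate a cluster split when a new snapshot arrives. How the paper thresholds this error, and how it decides between splitting and merging, is not stated in the abstract.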

Pages: 2059-2066

Page count: 8

References: 22 in total
