DenMG: Density-Based Member Generation for Ensemble Clustering

Cited by: 0
Authors
Du, Xueqin [1]
He, Yulin [1, 2]
Fournier-Viger, Philippe [1]
Huang, Joshua Zhexue [1, 2]
Affiliations
[1] Shenzhen Univ, Coll Comp Sci & Software Engn, Shenzhen, Peoples R China
[2] Shenzhen Univ, Guangdong Lab Artificial Intelligence & Digital E, Shenzhen, Peoples R China
Source
51ST INTERNATIONAL CONFERENCE ON PARALLEL PROCESSING WORKSHOPS PROCEEDINGS, ICPP 2022 | 2022
Funding
National Natural Science Foundation of China
Keywords
ensemble clustering; MMD; homocluster; heterocluster; neighborhood density; COMBINING MULTIPLE CLUSTERINGS;
DOI
10.1145/3547276.3548520
Chinese Library Classification (CLC) number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
Ensemble clustering is a popular approach for identifying clusters in data. It combines the results of multiple clustering algorithms to obtain more accurate and robust clusters. However, the performance of ensemble clustering algorithms depends greatly on the quality of their members. Based on this observation, this paper proposes a density-based member generation (DenMG) algorithm that selects ensemble members by considering distribution consistency. DenMG has two main components: one splits the sample points of a heterocluster, and the other merges sample points to form a homocluster. The first component estimates two probability density functions (p.d.f.s) from a heterocluster's sample points, representing one as a Gaussian distribution and the other as a Gaussian mixture model. If random numbers generated from these two p.d.f.s are deemed to follow different probability distributions, the heterocluster is split into smaller clusters. The second component merges clusters that have high neighborhood densities into a homocluster, using an opposite-oriented criterion that measures neighborhood density. A series of experiments was conducted to demonstrate the feasibility and effectiveness of the proposed ensemble member generation algorithm. The results show that the proposed algorithm generates high-quality ensemble members and, as a result, yields better clustering than five state-of-the-art ensemble clustering algorithms.
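The splitting test described in the abstract can be illustrated with a small sketch. It is not the authors' implementation. It assumes that the "MMD" keyword refers to the maximum mean discrepancy and that this statistic is what compares the two sets of generated random numbers; the abstract does not confirm either point. The Gaussian and mixture parameters below are hand-picked stand-ins for fitted models, and `SPLIT_THRESHOLD`, `should_split`, and `demo` are hypothetical names introduced only for illustration.

```python
import math
import random

# Hypothetical decision threshold; the record does not give the paper's criterion.
SPLIT_THRESHOLD = 0.05


def rbf_kernel(a, b, bandwidth=1.0):
    """Gaussian (RBF) kernel between two scalars."""
    return math.exp(-((a - b) ** 2) / (2.0 * bandwidth ** 2))


def mmd2(xs, ys, bandwidth=1.0):
    """Biased estimate of the squared MMD between two 1-D samples."""
    def mean_kernel(us, vs):
        total = sum(rbf_kernel(u, v, bandwidth) for u in us for v in vs)
        return total / (len(us) * len(vs))

    return mean_kernel(xs, xs) + mean_kernel(ys, ys) - 2.0 * mean_kernel(xs, ys)


def sample_gaussian(rng, n, mean, std):
    """Draw n values from a single Gaussian."""
    return [rng.gauss(mean, std) for _ in range(n)]


def sample_mixture(rng, n, components):
    """Draw n values from a mixture given as (weight, mean, std) tuples."""
    weights = [w for w, _, _ in components]
    out = []
    for _ in range(n):
        _, mean, std = rng.choices(components, weights=weights)[0]
        out.append(rng.gauss(mean, std))
    return out


def should_split(statistic, threshold=SPLIT_THRESHOLD):
    """Assumed rule: split when the two samples look differently distributed."""
    return statistic > threshold


def demo(seed=0, n=200):
    rng = random.Random(seed)
    # Stand-in for a mixture model fitted to a two-mode cluster.
    mixture = [(0.5, -3.0, 0.5), (0.5, 3.0, 0.5)]
    # Single Gaussian with the same mean and variance as that mixture.
    single_std = math.sqrt(9.0 + 0.25)
    gauss_a = sample_gaussian(rng, n, 0.0, single_std)
    gauss_b = sample_gaussian(rng, n, 0.0, single_std)
    mixed = sample_mixture(rng, n, mixture)
    same = mmd2(gauss_a, gauss_b)      # both samples from the single Gaussian
    different = mmd2(gauss_a, mixed)   # single Gaussian versus mixture
    return same, different


same, different = demo()
print("same-distribution MMD^2:", same, "split:", should_split(same))
print("Gaussian-vs-mixture MMD^2:", different, "split:", should_split(different))
```

When a cluster really contains two groups, a single Gaussian and a mixture model describe it differently, so samples drawn from the two models disagree and the statistic grows. For a cluster with one mode the two models nearly coincide and the statistic stays small. The second (merging) component is not sketched here because the abstract does not specify the opposite-oriented criterion.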
Pages: 7