A Density-Based Fuzzy Clustering Algorithm Using Multi-Representatives Points Per Cluster Based on a New Distance Measure Using KNN Algorithm

Cited by: 0
Authors
Klidbary, S. Haghzad [1 ]
Javadian, M. [2 ]
Affiliations
[1] Univ Zanjan, Fac Engn, Dept Elect & Comp Engn, Zanjan, Iran
[2] Shahid Beheshti Univ, Sch Elect Engn, Tehran, Iran
Source
INTERNATIONAL JOURNAL OF ENGINEERING | 2025 / Vol. 38 / No. 12
Keywords
Clustering; Similarity Measure; Fuzzy Logic; Ink Drop Spread; Active Learning Method; High-Dimensional Clustering; ACTIVE LEARNING-METHOD; FAST SEARCH; FIND;
DOI
10.5829/ije.2025.38.12c.05
Chinese Library Classification number
T [Industrial Technology];
Discipline classification code
08 ;
Abstract
In analyzing the phenomena around us, clustering is among the most commonly used machine learning techniques for comparing them and categorizing them into groups based on their intrinsic features. One of the main challenges facing clustering algorithms is selecting a suitable representative for each cluster. Existing algorithms often choose a single representative, which can lead to suboptimal performance on many datasets, especially asymmetric ones. The quality of that choice depends entirely on the internal distribution of each cluster, so a single point may not represent the cluster well. The proposed algorithm, inspired by the fuzzy ALM (Active Learning Method) and avoiding complex formulas and calculations, first breaks the system down into simpler two-dimensional systems. After spreading ink drops, it finds the vertical narrow path and the horizontal narrow path and selects a set of points as the representatives of each cluster. Unlike many conventional algorithms, it thus provides a representative set for each cluster; it also improves performance on datasets with an asymmetric structure by introducing a new distance measure based on the KNN method and the set of prime numbers. On many low-dimensional and high-dimensional datasets, the Accuracy, F1-Score, and AMI achieved were higher than those of algorithms such as FUALM, HiDUALM, K-Means, DBSCAN, DENCLUE, and IRFLLRR, and in some cases the accuracy reached 100 percent.
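The abstract's point about single versus multiple representatives can be illustrated with a small sketch. This is not the paper's algorithm: it does not implement ink drop spread, narrow-path extraction, or the KNN and prime-number distance measure, none of which are specified in this record. It assumes plain Euclidean distance, and the names `nearest_cluster` and `centroid` and the toy data are hypothetical, chosen only to show how an elongated cluster can be misjudged when it is summarized by one point.

```python
import math


def centroid(points):
    """Mean of a list of equal-length coordinate tuples."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def nearest_cluster(point, representatives):
    """Return the label whose closest representative is nearest to `point`.

    `representatives` maps a cluster label to a list of representative
    points. Euclidean distance is an assumption of this sketch.
    """
    best_label, best_dist = None, math.inf
    for label, reps in representatives.items():
        d = min(math.dist(point, r) for r in reps)
        if d < best_dist:
            best_label, best_dist = label, d
    return best_label


# Toy data: one elongated cluster along the x-axis, one compact cluster.
elongated = [(float(x), 0.0) for x in range(0, 21, 5)]
compact = [(0.0, 3.0), (0.5, 3.0), (0.0, 3.5)]

# One representative per cluster (the centroid) versus a set per cluster.
single = {"elongated": [centroid(elongated)], "compact": [centroid(compact)]}
multi = {"elongated": elongated, "compact": compact}

query = (1.0, 0.0)  # lies on the elongated cluster, far from its centroid
print(nearest_cluster(query, single))
print(nearest_cluster(query, multi))
```

With one centroid per cluster, the query is assigned to the compact cluster because the elongated cluster's centroid is far away; with a set of representatives it is assigned to the elongated cluster it lies on.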
Pages: 2865-2876
Number of pages: 12