A Density-Based Fuzzy Clustering Algorithm Using Multi-Representatives Points Per Cluster Based on a New Distance Measure Using KNN Algorithm

Cited by: 0
Authors
Klidbary, S. Haghzad [1 ]
Javadian, M. [2 ]
Affiliations
[1] Univ Zanjan, Fac Engn, Dept Elect & Comp Engn, Zanjan, Iran
[2] Shahid Beheshti Univ, Sch Elect Engn, Tehran, Iran
Source
INTERNATIONAL JOURNAL OF ENGINEERING, 2025, Vol. 38, Issue 12
Keywords
Clustering; Similarity Measure; Fuzzy Logic; Ink Drop Spread; Active Learning Method; High-Dimensional Clustering
DOI
10.5829/ije.2025.38.12c.05
Chinese Library Classification: T [Industrial Technology]
Subject Classification Code: 08
Abstract
Clustering is among the most commonly used machine learning techniques for comparing phenomena and grouping them based on their intrinsic features. One of the main challenges facing clustering algorithms is selecting a suitable representative for each cluster. Existing algorithms often choose a single representative, which can lead to suboptimal performance on many datasets, especially asymmetric ones: whether that single point represents a cluster well depends entirely on the cluster's internal distribution. The proposed algorithm, inspired by the fuzzy ALM method and avoiding complex formulas and calculations, first breaks the system down into simpler two-dimensional subsystems. After spreading ink drops, it finds the vertical and horizontal narrow paths and selects a set of points as the representative of each cluster. Unlike many conventional algorithms, the proposed method therefore provides a representative set for each cluster, and it further improves performance on datasets with an asymmetric structure by introducing a new distance measure based on the KNN method and the set of prime numbers. The Accuracy, F1-Score, and AMI achieved on many low-dimensional and high-dimensional datasets have been higher than those of algorithms such as FUALM, HiDUALM, K-Means, DBSCAN, DENCLUE, and IRFLLRR, and in some cases the accuracy has reached 100 percent.
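The abstract describes measuring the distance from a sample to a set of representative points per cluster, via a KNN-based measure, rather than to a single centroid. The exact formulation (including how the set of prime numbers is used) is not given in this record, so the following is only a minimal Python sketch of the general idea, assuming the distance to a cluster is the mean Euclidean distance to the k nearest representatives; the names multi_rep_knn_distance and assign_clusters are hypothetical and not from the paper.

import numpy as np

def multi_rep_knn_distance(x, reps, k=3):
    """Distance from point x to a cluster described by a set of
    representative points `reps`: mean Euclidean distance to the
    k nearest representatives (a hypothetical stand-in for the
    paper's KNN-based measure)."""
    d = np.linalg.norm(reps - x, axis=1)   # distance to every representative
    k = min(k, len(d))                     # guard against small representative sets
    return np.sort(d)[:k].mean()           # average over the k closest ones

def assign_clusters(X, rep_sets, k=3):
    """Assign each sample to the cluster whose representative set
    is nearest under the multi-representative KNN distance."""
    labels = []
    for x in X:
        dists = [multi_rep_knn_distance(x, reps, k) for reps in rep_sets]
        labels.append(int(np.argmin(dists)))
    return np.array(labels)

# Toy usage: two clusters, each described by several representative points.
rep_sets = [np.array([[0.0, 0.0], [0.5, 0.2], [0.2, 0.6]]),   # cluster 0 representatives
            np.array([[5.0, 5.0], [5.4, 4.8], [4.7, 5.3]])]   # cluster 1 representatives
X = np.array([[0.3, 0.4], [5.1, 5.1]])
print(assign_clusters(X, rep_sets))   # expected: [0 1]

The point of the multi-representative distance is that an elongated or asymmetric cluster is compared against several of its own points, so a sample near one end of the cluster is not penalized for being far from a single central prototype.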
Pages: 2865-2876
Page count: 12