Convex Optimization Techniques for High-Dimensional Data Clustering Analysis: A Review

Cited by: 0
Authors
Yousif, Ahmed Yacoub [1, 2]
Al-Sarray, Basad [3]
Affiliations
[1] University of Baghdad, College of Science, Mathematics Department, Baghdad
[2] University of Technology, Department of Applied Sciences, Baghdad
[3] University of Baghdad, College of Science, Computer Science Department
Source
Iraqi Journal for Computer Science and Mathematics | 2024 / Vol. 5 / No. 3
Keywords
Augmented Lagrangian algorithm; Convex clustering; Global optimality; High-dimensional data; Regularization; Semi-smooth Newton; Unsupervised learning
DOI
10.52866/ijcsm.2024.05.03.022
Abstract
Clustering techniques have been instrumental in discerning patterns and relationships within datasets in data analytics and unsupervised machine learning. Traditional clustering algorithms struggle to handle real-world data analysis problems where the number of clusters is not readily identifiable. Moreover, they face challenges in determining the optimal number of clusters for high-dimensional datasets. Consequently, there is a demand for enhanced, adaptable and efficient techniques. Convex clustering, rooted in a rich mathematical framework, has steadily emerged as a pivotal alternative to traditional techniques. It amalgamates the strengths of conventional approaches while ensuring robustness and guaranteeing globally optimal solutions. This review offers an in-depth exploration of convex clustering, detailing its formulation, challenges and practical applications. It examines synthetic datasets, which serve as foundational platforms for academic exploration, emphasizing their interactions with the semi-smooth Newton augmented Lagrangian (SSNAL) algorithm. Convex clustering provides a robust theoretical foundation, but challenges, including computational limitations with expansive datasets and noise management in high-dimensional contexts, persist. Hence, the paper discusses current challenges and prospective future directions in the domain. This research aims to illuminate the potency and potential of convex clustering in modern data analytics, highlighting its robustness, flexibility and adaptability across diverse datasets and applications. © 2024 College of Education, Al-Iraqia University. All rights reserved.
Pages: 378-398
Page count: 20
Related papers
118 entries in total
  • [1] Ngo G. C., Macabebe E. Q. B., Image segmentation using K-means color quantization and density-based spatial clustering of applications with noise (DBSCAN) for hotspot detection in photovoltaic modules, 2016 IEEE Region 10 Conference (TENCON), pp. 1614-1618, (2016)
  • [2] Rudrappa G., Cloud Classification Using Ground Based Images Using CBIR and K-Means Clustering, Biosci Biotechnol Res Commun, 13, 13, pp. 95-99, (2020)
  • [3] Karim M. R., et al., Deep learning-based clustering approaches for bioinformatics, Brief Bioinform, 22, 1, pp. 393-415, (2021)
  • [4] Nascimento M., et al., Independent Component Analysis (ICA) based-clustering of temporal RNA-seq data, PLoS One, 12, 7, (2017)
  • [5] Hussein S. A., Zahid I. A., Improved Naked Mole-Rat Algorithm Based on Variable Neighborhood Search for the N-Queens Problem, Iraqi Journal of Science, (2024)
  • [6] Hussein S. A., Yousif A. Y., An Improved Meerkat Clan Algorithm for Solving 0-1 Knapsack Problem, Iraqi Journal of Science, pp. 773-784, (2022)
  • [7] Ezugwu A. E., et al., A comprehensive survey of clustering algorithms: State-of-the-art machine learning applications, taxonomy, challenges, and future research prospects, Eng Appl Artif Intell, 110, (2022)
  • [8] Saxena A., et al., A review of clustering techniques and developments, Neurocomputing, 267, pp. 664-681, (2017)
  • [9] Salman N. H., Mohammed S. N., Image Segmentation Using PSO-Enhanced K-Means Clustering and Region Growing Algorithms, Iraqi Journal of Science, pp. 4988-4998, (2021)
  • [10] Franti P., Sieranoja S., K-means properties on six clustering benchmark datasets, Applied Intelligence, 48, 12, pp. 4743-4759, (2018)