Big data analysis using a parallel ensemble clustering architecture and an unsupervised feature selection approach

Cited by: 2
Authors
Wang, Yubo [1 ]
Saraswat, Shelesh Krishna [2 ]
Komari, Iraj Elyasi [3 ]
Affiliations
[1] Corp Taiji Comp Corp Ltd, China Elect Technol Grp, Innovat Res Inst, Beijing 100070, Peoples R China
[2] GLA Univ Mathura, Dept Elect & Commun, Mathura, Uttar Pradesh, India
[3] Islamic Azad Univ, Dept Comp Engn, Andimeshk Branch, Andimeshk, Iran
Keywords
Ensemble clustering; Consensus selection; Cluster merit; Parallel clustering architecture; Prediction; Algorithm; Criterion
DOI
10.1016/j.jksuci.2022.11.016
Chinese Library Classification (CLC)
TP [Automation Technology, Computer Technology]
Subject Classification Code
0812
Abstract
Ensemble clustering is a challenging research direction in data mining in which the results of several individual clustering methods are combined to produce higher-quality final clusters. This study introduces a parallel hierarchical clustering approach based on the divide-and-conquer strategy as an attempt to achieve faster and more efficient ensemble clustering. We propose a cluster consensus selection approach that selects a subset of high-merit primary clusters to participate in the final consensus. Considering sample-cluster and cluster-cluster similarity over the selected primary clusters, we form the final clusters with a clustering-of-clusters technique as the consensus function. In addition, the proposed scheme is equipped with an unsupervised feature selection approach that removes features that do not contribute significantly to clustering. Extensive evaluations were performed on datasets of different dimensions from the University of California Irvine (UCI) machine learning repository. The simulation results confirm the efficiency of the proposed scheme, which improves average performance by 6% to 24% compared with state-of-the-art clustering methods. © 2022 The Author(s). Published by Elsevier B.V. on behalf of King Saud University. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).
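To make the pipeline summarized in the abstract concrete (unsupervised feature selection, generation of primary clusters, merit-based cluster selection, and a clustering-of-clusters consensus), the following Python sketch reproduces the general idea only. The variance-based feature filter, the silhouette-based merit score, the Jaccard cluster-cluster similarity, and the agglomerative meta-clustering step are simplifying assumptions chosen for brevity; they are not the authors' actual formulation, and the parallel divide-and-conquer aspect is omitted.

# Illustrative sketch of ensemble clustering with cluster consensus selection.
# NOT the paper's implementation: the feature filter, merit score, and
# consensus function below are simplified stand-ins.
import numpy as np
from sklearn.cluster import KMeans, AgglomerativeClustering
from sklearn.metrics import silhouette_samples

def unsupervised_feature_filter(X, keep_ratio=0.75):
    """Keep the highest-variance features (a stand-in for the paper's
    unsupervised feature selection step)."""
    variances = X.var(axis=0)
    k = max(1, int(keep_ratio * X.shape[1]))
    return X[:, np.argsort(variances)[-k:]]

def generate_primary_clusters(X, n_members=5, k=3, seed=0):
    """Run several k-means members; return each primary cluster as a boolean
    membership vector with a merit score (mean silhouette of its samples)."""
    clusters = []
    for m in range(n_members):
        labels = KMeans(n_clusters=k, n_init=10, random_state=seed + m).fit_predict(X)
        sil = silhouette_samples(X, labels)
        for c in np.unique(labels):
            mask = labels == c
            clusters.append((mask, sil[mask].mean()))
    return clusters

def consensus(X, clusters, merit_threshold=0.0, n_final=3):
    """Keep only clusters whose merit exceeds a threshold, group the survivors
    by cluster-cluster (Jaccard) similarity, then assign each sample to the
    meta-cluster it co-occurs with most often (sample-cluster similarity).
    Assumes at least n_final clusters pass the threshold."""
    selected = [mask for mask, merit in clusters if merit > merit_threshold]
    # Cluster-cluster similarity: Jaccard overlap between membership vectors.
    J = np.array([[np.logical_and(a, b).sum() / max(np.logical_or(a, b).sum(), 1)
                   for b in selected] for a in selected])
    meta = AgglomerativeClustering(n_clusters=n_final, metric="precomputed",
                                   linkage="average").fit_predict(1.0 - J)
    # Sample-cluster similarity: count how often each sample falls in each meta-cluster.
    votes = np.zeros((X.shape[0], n_final))
    for mask, group in zip(selected, meta):
        votes[mask, group] += 1
    return votes.argmax(axis=1)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Three well-separated synthetic groups, 6 features each.
    X = np.vstack([rng.normal(loc, 0.5, size=(50, 6)) for loc in (0, 3, 6)])
    X = unsupervised_feature_filter(X)
    final_labels = consensus(X, generate_primary_clusters(X))
    print(final_labels[:10])

Running the script on the small synthetic dataset prints consensus labels for the first ten samples; the threshold, number of ensemble members, and number of final clusters are arbitrary illustration values.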
Pages: 270-282
Page count: 13