Enhanced synchronization-inspired clustering for high-dimensional data

Cited by: 16
Authors
Chen, Lei [1]
Guo, Qinghua [1]
Liu, Zhaohua [1]
Zhang, Shiwen [1]
Zhang, Hongqiang [1]
Affiliations
[1] Hunan Univ Sci & Technol, Sch Informat & Elect Engn, Xiangtan, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Synchronization-inspired; Clustering; High-dimensional dataset; Local density; METRICS; PCA;
DOI
10.1007/s40747-020-00191-y
Chinese Library Classification number
TP18 [Artificial Intelligence Theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The synchronization-inspired clustering algorithm (Sync) is a novel clustering algorithm that can accurately cluster datasets of arbitrary shape, density, and distribution. However, high-dimensional datasets, with their high noise and high redundancy, pose new challenges for Sync: clustering time increases significantly and clustering accuracy decreases. To address these challenges, this paper develops an enhanced synchronization-inspired clustering algorithm, SyncHigh, to cluster high-dimensional datasets quickly and accurately. First, a dimension purification strategy based on Principal Component Analysis (PCA) is designed to find the principal components among all attributes. Second, a density-based data merge strategy is constructed to reduce the number of objects participating in synchronization-inspired clustering, thereby reducing clustering time. Third, the Kuramoto model is enhanced to handle the mass differences between objects that the density-based data merge strategy introduces. Finally, extensive experiments on synthetic and real-world datasets show the effectiveness and efficiency of the SyncHigh algorithm.
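The abstract names three steps but gives no formulas, so the following Python sketch is only an illustration of how such a pipeline might look, not the authors' SyncHigh implementation. The function names (`pca_reduce`, `merge_dense`, `weighted_sync_step`), the greedy radius-based merge, the `variance_kept`, `radius`, and `eps` parameters, and the mass-weighted averaging rule are all assumptions introduced here; only the overall order (PCA, density-based merge, mass-aware Kuramoto-style update) comes from the abstract.

```python
import numpy as np


def pca_reduce(X, variance_kept=0.95):
    """Project X onto the leading principal components (computed via SVD).

    variance_kept is a hypothetical threshold, not a value from the paper.
    """
    Xc = X - X.mean(axis=0)
    _, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    explained = S ** 2 / np.sum(S ** 2)
    # Smallest k whose cumulative explained variance reaches the threshold.
    k = int(np.searchsorted(np.cumsum(explained), variance_kept)) + 1
    return Xc @ Vt[:k].T


def merge_dense(X, radius):
    """Greedy merge (an assumed stand-in for the density-based merge).

    Each still-unassigned point absorbs all unassigned points within
    `radius`; the group is replaced by its centroid, and the group size
    is kept as the representative's mass.
    """
    unassigned = np.ones(len(X), dtype=bool)
    centers, masses = [], []
    for i in range(len(X)):
        if not unassigned[i]:
            continue
        near = unassigned & (np.linalg.norm(X - X[i], axis=1) <= radius)
        centers.append(X[near].mean(axis=0))
        masses.append(near.sum())
        unassigned[near] = False
    return np.array(centers), np.array(masses, dtype=float)


def weighted_sync_step(P, mass, eps):
    """One Kuramoto-style update in which neighbours pull in proportion to mass.

    Assumes coordinates are scaled to a small range (for example [0, pi/2])
    so that sin(difference) keeps the sign of the difference.
    """
    diff = P[None, :, :] - P[:, None, :]      # diff[i, j] = P[j] - P[i]
    dist = np.linalg.norm(diff, axis=2)
    w = (dist <= eps) * mass[None, :]         # self is included; sin(0) = 0
    pull = (w[:, :, None] * np.sin(diff)).sum(axis=1)
    return P + pull / w.sum(axis=1, keepdims=True)
```

In this sketch a heavy representative (one that absorbed many original objects) moves less per step than a light neighbour, which is one plausible reading of "mass differences between objects"; the paper's own coupling rule may differ.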
Pages: 203-223
Page count: 21