Adaptive K-means Algorithm Based on Three-Way Decision

Cited by: 2

Authors:

Peng, Yihang [1,2]

Zhang, Qinghua [1,2]

Ai, Zhihua [1,2]

Zhi, Xuechao [1,2]

Affiliations:

[1] Chongqing Key Lab Computat Intelligence, Chongqing, Peoples R China

[2] Chongqing Univ Posts & Telecommun, Chongqing, Peoples R China

Source:

ROUGH SETS, IJCRS 2022 | 2022 / Vol. 13633

Funding:

National Natural Science Foundation of China;

Keywords:

Three-way clustering; Three-way decision; Neighborhood; K-means; Accuracy of approximation; CLUSTERING-ALGORITHM;

DOI:

10.1007/978-3-031-21244-4_29

Chinese Library Classification number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Traditional k-means and its improved variants focus on finding the initial cluster centers and an appropriate number of clusters, and on allocating samples to clusters with clear boundaries. These algorithms cannot handle clusters with imprecise boundaries, or the inaccurate decisions that result from inaccurate information or insufficient data. Three-way clustering can solve this problem to a certain extent. However, most existing three-way clustering algorithms either divide all clusters into three regions using the same threshold or set the three regions subjectively, so they are not suitable for clusters with different sizes and densities. To solve these problems, this paper proposes an adaptive k-means algorithm based on three-way decision. First, the traditional clustering results are taken as the target set and core region. The distances between samples in the target set are used as candidate neighborhood radius thresholds. At the same time, a neighborhood relation is introduced to calculate the accuracy of approximation and the upper and lower approximations of the target set under the current neighborhood relation. Second, a boundary control coefficient is defined according to the accuracy of approximation, and as many abnormal samples as possible are assigned to boundary regions, transforming traditional clustering into three-way clustering that adapts to different sizes and densities. Finally, five indexes are compared on UCI data sets and artificial data sets, and the experimental results indicate the effectiveness of the proposed algorithm.
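The neighborhood quantities named in the abstract can be illustrated with a minimal sketch. It uses the standard neighborhood rough set definitions (lower approximation: samples whose whole neighborhood lies in the target set; upper approximation: samples whose neighborhood intersects it; accuracy of approximation: the ratio of their sizes). It is not the authors' algorithm: the boundary control coefficient is not specified in this record and is not reproduced, and the toy data, the hard cluster `cluster_a`, and all function names are assumptions made for illustration.

```python
from math import dist  # Euclidean distance, Python 3.8+


def neighborhood(points, i, radius):
    """Indices of all points within `radius` of points[i] (i itself included)."""
    return {j for j, p in enumerate(points) if dist(points[i], p) <= radius}


def approximations(points, target, radius):
    """Lower and upper approximation of `target` (a set of indices)."""
    lower, upper = set(), set()
    for i in range(len(points)):
        nbrs = neighborhood(points, i, radius)
        if nbrs <= target:   # neighborhood entirely inside the target set
            lower.add(i)
        if nbrs & target:    # neighborhood overlaps the target set
            upper.add(i)
    return lower, upper


def accuracy(lower, upper):
    """Accuracy of approximation: |lower| / |upper|."""
    return len(lower) / len(upper) if upper else 1.0


def candidate_radii(points, target):
    """Distinct pairwise distances inside the target set, sorted ascending."""
    idx = sorted(target)
    return sorted({dist(points[a], points[b])
                   for k, a in enumerate(idx) for b in idx[k + 1:]})


# Toy data (assumed): seven 2-D samples; cluster_a stands in for one hard
# cluster produced by an ordinary k-means run.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (2.0, 0.0),
          (3.0, 0.0), (4.0, 0.0), (4.0, 1.0)]
cluster_a = {0, 1, 2, 3}

radii = candidate_radii(points, cluster_a)
lower, upper = approximations(points, cluster_a, radius=1.0)
core = lower              # samples that certainly belong to the cluster
fringe = upper - lower    # samples that only possibly belong to it
acc = accuracy(lower, upper)
```

With a radius of 1.0 on this toy data, samples 0, 1 and 2 form the core, samples 3 and 4 fall into the fringe, and the accuracy of approximation is 3/5. Repeating the computation for each value in `radii` shows how the core shrinks and the fringe grows as the neighborhood widens, which is the kind of per-cluster signal an adaptive threshold could be built on.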


Pages: 390-404

Page count: 15
