An automatic three-way clustering method based on sample similarity

Cited by: 2
Authors
Xiuyi Jia
Ya Rao
Weiwei Li
Sichun Yang
Hong Yu
Affiliations
[1] Nanjing University of Science and Technology, School of Computer Science and Engineering
[2] Nanjing University of Aeronautics and Astronautics, College of Astronautics
[3] Anhui University of Technology, School of Computer Science and Technology
[4] Chongqing University of Posts and Telecommunications, College of Computer Science and Technology
Source
International Journal of Machine Learning and Cybernetics | 2021 / Vol. 12
Keywords
Three-way decisions; Three-way clustering; Sample similarity
DOI
Not available
Abstract
Three-way clustering extends traditional clustering by adding the concept of a fringe region, which can effectively address the inaccurate decisions that traditional two-way clustering methods make when information is inaccurate or data are insufficient. Existing three-way clustering methods often select the number of clusters and the thresholds for the three-way partition by subjective tuning. However, fixing the number of clusters and the partition thresholds in this way cannot automatically select optimal values for data sets of different sizes and densities. To address this problem, this paper proposes an improved three-way clustering method. First, we define the roughness degree by introducing sample similarity to measure the uncertainty of the fringe region. Based on the roughness degree, we then define a novel partitioning validity index to evaluate clustering partitions and propose an automatic threshold selection method. Second, based on the concept of sample similarity, we introduce the intra-class similarity and the inter-class similarity to describe the quantitative change in the relationship between a sample and the clusters, and we integrate these two similarities into a novel clustering validity index that measures clustering performance under different numbers of clusters. Furthermore, we propose an automatic cluster number selection method. Finally, we give an automatic three-way clustering approach by combining the proposed threshold selection method and cluster number selection method. Comparison experiments demonstrate the effectiveness of our proposal.
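As a rough illustration of the core and fringe regions the abstract refers to, the following minimal sketch partitions samples around fixed cluster centers using two thresholds. It is not the paper's method: the Gaussian `similarity` function, the `gamma` parameter, the threshold names `alpha` and `beta`, and the fixed centers are all assumptions made for this example, and the thresholds are set by hand here rather than selected automatically.

```python
from math import dist, exp


def similarity(x, y, gamma=1.0):
    """Hypothetical Gaussian similarity in (0, 1]; not the paper's definition."""
    return exp(-gamma * dist(x, y) ** 2)


def three_way_partition(samples, centers, alpha, beta):
    """Split samples into core and fringe regions for each cluster.

    Assumed rule (illustrative only):
      similarity >= alpha          -> core region of the cluster
      beta <= similarity < alpha   -> fringe region of the cluster
      similarity < beta            -> outside the cluster
    """
    if not 0 <= beta < alpha <= 1:
        raise ValueError("expected 0 <= beta < alpha <= 1")
    core = [[] for _ in centers]
    fringe = [[] for _ in centers]
    for i, x in enumerate(samples):
        for k, c in enumerate(centers):
            s = similarity(x, c)
            if s >= alpha:
                core[k].append(i)
            elif s >= beta:
                fringe[k].append(i)
    return core, fringe


# Toy data: two tight groups and one sample halfway between them.
samples = [(0.0, 0.0), (0.1, 0.0), (1.0, 0.0), (2.0, 0.0), (1.9, 0.0)]
centers = [(0.0, 0.0), (2.0, 0.0)]
core, fringe = three_way_partition(samples, centers, alpha=0.8, beta=0.3)
print("core:", core)
print("fringe:", fringe)
```

In this toy run the sample halfway between the two centers falls into the fringe region of both clusters instead of being forced into one of them, which is the behavior that distinguishes a three-way partition from a two-way one.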
Pages: 1545-1556
Page count: 11