A three-way clustering approach for handling missing data using GTRS

Cited by: 108
Authors
Afridi, Mohammad Khan [1]
Azam, Nouman [1]
Yao, JingTao [2]
Alanazi, Eisa [3]
Affiliations
[1] Natl Univ Comp & Emerging Sci, Islamabad, Pakistan
[2] Univ Regina, Dept Comp Sci, Regina, SK S4S 0A2, Canada
[3] Umm Al Qura Univ, Dept Comp Sci, Mecca, Saudi Arabia
Funding
Natural Sciences and Engineering Research Council of Canada;
Keywords
Clustering; Three-way decisions; Game-theoretic rough sets; Missing data; Uncertainty; THEORETIC ROUGH SETS; INCOMPLETE DATA; FUZZY; ALGORITHM;
DOI
10.1016/j.ijar.2018.04.001
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Clustering is an important data analysis task. It becomes a challenge in the presence of uncertainty caused by incomplete, missing or corrupted data. A three-way approach has recently been introduced to deal with uncertainty in clustering due to missing values. The essential idea is to make a deferment decision whenever it is not clear whether or not to include an object in a cluster. A key issue in the three-way approach is to determine the thresholds that define the three types of decisions, namely, include an object in a cluster, exclude an object from a cluster, or delay (defer) the decision of inclusion or exclusion. Existing studies do not sufficiently address the determination of these thresholds and generally use fixed values. In this paper, we explore the use of the game-theoretic rough set (GTRS) model to handle this issue. In particular, a game is defined in which the thresholds are determined by a tradeoff between the accuracy and the generality of clusters. The determined thresholds are then used to induce three-way decisions for clustering uncertain objects. Experimental results on four datasets from the UCI machine learning repository suggest that GTRS significantly improves generality while keeping similar levels of accuracy in comparison to other three-way and similar models. (C) 2018 Elsevier Inc. All rights reserved.
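The three decision types described in the abstract can be illustrated with a minimal sketch. This is not the paper's GTRS procedure: the names `three_way_decision`, `accuracy_and_generality`, `alpha`, and `beta` are hypothetical, the membership score in [0, 1] is an assumed input, and the accuracy and generality formulas are simplified stand-ins chosen only to show the tradeoff that a threshold pair induces.

```python
from typing import List, Tuple


def three_way_decision(score: float, alpha: float, beta: float) -> str:
    """Return 'include', 'exclude', or 'defer' for one object-cluster pair.

    score is an assumed membership estimate in [0, 1]; alpha and beta are
    the upper and lower thresholds (hypothetical names).
    """
    if not 0.0 <= beta < alpha <= 1.0:
        raise ValueError("expected 0 <= beta < alpha <= 1")
    if score >= alpha:
        return "include"
    if score <= beta:
        return "exclude"
    return "defer"


def accuracy_and_generality(
    scores: List[float], truth: List[bool], alpha: float, beta: float
) -> Tuple[float, float]:
    """Simplified, assumed measures for a given threshold pair.

    generality: fraction of objects that receive a definite decision.
    accuracy:   fraction of definite decisions that agree with truth
                (0.0 when every object is deferred).
    """
    decisions = [three_way_decision(s, alpha, beta) for s in scores]
    definite = [(d, t) for d, t in zip(decisions, truth) if d != "defer"]
    correct = sum(1 for d, t in definite if (d == "include") == t)
    generality = len(definite) / len(scores) if scores else 0.0
    accuracy = correct / len(definite) if definite else 0.0
    return accuracy, generality


if __name__ == "__main__":
    # Made-up scores and ground-truth memberships, for illustration only.
    scores = [0.9, 0.5, 0.1, 0.8]
    truth = [True, True, False, False]
    print(accuracy_and_generality(scores, truth, alpha=0.7, beta=0.3))
```

Under these assumed measures, moving `alpha` and `beta` closer together defers fewer objects, which raises generality but can lower accuracy. The abstract describes resolving this kind of tradeoff through a game rather than through fixed threshold values.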
Pages: 11-24
Page count: 14