A fuzzy rough set-based undersampling approach for imbalanced data

Cited by: 1
Authors
Zhang, Xiao [1 ]
He, Zhaoqian [1 ]
Yang, Yanyan [2 ]
Affiliations
[1] Xian Univ Technol, Dept Appl Math, 58 Yanxiang Rd, Xian 710054, Shaanxi, Peoples R China
[2] Beijing Jiaotong Univ, Sch Software Engn, Beixiaguan Rd, Beijing 100044, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Imbalanced data; Fuzzy rough sets; Undersampling; Instance selection; CLASSIFIERS; REDUCTION;
DOI
10.1007/s13042-023-02064-5
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Handling imbalanced data effectively is an active research topic in machine learning and data mining. Undersampling is a popular technique for dealing with imbalanced data: it selects a subset of instances from the majority class so that the dataset becomes balanced. However, traditional undersampling approaches may lose information carried by the majority class instances. Therefore, based on the concept of the importance degree of a fuzzy granule, this paper presents a criterion for selecting representative instances from the majority class that considers the fuzzy relations between the k-nearest neighbors of a majority class instance and the minority class instances. An undersampling approach based on fuzzy rough sets (USFRS) is then put forward. With USFRS, the representativeness of the selected majority class instances can be guaranteed and the information loss due to undersampling can be reduced as far as possible. Furthermore, USFRS is compared with related undersampling methods, and the differences in the experimental results are analyzed with statistical tests. The experimental results demonstrate that USFRS performs well in classifying imbalanced data.
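The abstract does not give the formulas behind USFRS, so the following is only a rough sketch of the general idea (score each majority class instance from the fuzzy relations between its k-nearest neighbors and the minority class, then keep the top-scoring instances). Everything specific here is an assumption made for illustration and is not taken from the paper: the Gaussian-kernel fuzzy similarity, the `sigma` parameter, taking neighbors within the majority class, the averaged "separation" score, and the function names `gaussian_similarity` and `undersample_majority`.

```python
import numpy as np


def gaussian_similarity(A, B, sigma=1.0):
    """Pairwise fuzzy similarity in (0, 1] between rows of A and rows of B.

    Assumption: a Gaussian kernel stands in for the fuzzy relation.
    """
    sq_dist = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-sq_dist / (2.0 * sigma ** 2))


def undersample_majority(X_maj, X_min, k=3, sigma=1.0):
    """Pick len(X_min) majority rows using a hypothetical granule score.

    Returns (sorted indices of kept majority rows, score of every majority row).
    Requires k < len(X_maj).
    """
    sim_maj = gaussian_similarity(X_maj, X_maj, sigma)
    sim_to_min = gaussian_similarity(X_maj, X_min, sigma)

    # Degree to which each majority row is separated from the minority class:
    # one minus its largest similarity to any minority row.
    separation = 1.0 - sim_to_min.max(axis=1)

    # k most similar majority neighbors of each row, excluding the row itself.
    np.fill_diagonal(sim_maj, -np.inf)
    neighbors = np.argsort(-sim_maj, axis=1)[:, :k]

    # Illustrative score (not the paper's importance degree): mean separation
    # of the k neighbors that form the granule around each majority row.
    scores = separation[neighbors].mean(axis=1)

    n_keep = min(len(X_min), len(X_maj))
    keep = np.argsort(-scores, kind="stable")[:n_keep]
    return np.sort(keep), scores
```

Calling `undersample_majority(X_maj, X_min, k=3)` on two NumPy arrays of feature rows returns as many majority indices as there are minority rows, which yields a balanced training set once the kept rows are combined with `X_min`. A different fuzzy relation or scoring rule can be swapped in without changing the overall structure.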
Pages: 2799-2810
Number of pages: 12