A fuzzy rough set-based undersampling approach for imbalanced data

Cited by: 1
Authors
Zhang, Xiao [1]
He, Zhaoqian [1]
Yang, Yanyan [2]
Affiliations
[1] Xian Univ Technol, Dept Appl Math, 58 Yanxiang Rd, Xian 710054, Shanxi, Peoples R China
[2] Beijing Jiaotong Univ, Sch Software Engn, Beixiaguan Rd, Beijing 100044, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Imbalanced data; Fuzzy rough sets; Undersampling; Instance selection; CLASSIFIERS; REDUCTION;
DOI
10.1007/s13042-023-02064-5
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Handling imbalanced data effectively is an active research topic in machine learning and data mining. Undersampling is a popular technique for dealing with imbalanced data: it selects a subset of instances from the majority class so that the dataset becomes balanced. However, traditional undersampling approaches may lose information carried by the majority class instances. To address this, and building on the concept of the importance degree of a fuzzy granule, this paper presents a criterion for selecting representative instances from the majority class that considers the fuzzy relations between the k-nearest neighbors of a majority class instance and the minority class instances. On this basis, an undersampling approach based on fuzzy rough sets (USFRS) is proposed. With USFRS, the representativeness of the selected majority class instances is preserved and the information loss due to undersampling is kept as small as possible. USFRS is then compared with related undersampling methods, and the differences in the experimental results are analyzed with statistical tests. The experimental results demonstrate that USFRS performs well in classification on imbalanced data.
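The abstract does not state the formulas behind USFRS, so the following is only a rough sketch of the general idea and not the authors' method. It assumes a Gaussian fuzzy similarity relation, a lower-approximation-style membership (one minus the highest similarity to any minority instance), and a score that averages this membership over a majority instance and its k nearest majority neighbors. The function names, the `sigma` parameter, and the scoring rule are all hypothetical choices made for illustration.

```python
import math

def similarity(a, b, sigma=1.0):
    """Assumed fuzzy similarity relation: Gaussian kernel with values in (0, 1]."""
    d2 = sum((x - y) ** 2 for x, y in zip(a, b))
    return math.exp(-d2 / (2.0 * sigma ** 2))

def undersample_majority(maj, mino, k=3, sigma=1.0):
    """Return sorted indices of majority instances to keep (as many as minority instances).

    Illustrative stand-in only; the scoring rule is an assumption,
    not the importance degree defined in the paper.
    """
    n = len(maj)
    # Membership of each majority instance: 1 minus its highest
    # similarity to any minority instance.
    lower = [1.0 - max(similarity(x, m, sigma) for m in mino) for x in maj]
    k = min(k, n - 1)
    scores = []
    for i, x in enumerate(maj):
        # k nearest majority neighbours of instance i, excluding itself.
        others = [j for j in range(n) if j != i]
        others.sort(key=lambda j: similarity(x, maj[j], sigma), reverse=True)
        neigh = others[:k]
        # Hypothetical score: mean membership over the instance and its neighbours.
        scores.append((lower[i] + sum(lower[j] for j in neigh)) / (k + 1))
    n_keep = min(len(mino), n)
    ranked = sorted(range(n), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:n_keep])

# Toy data: six majority points, two minority points near (3, 3).
maj = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (3.0, 3.0), (5.0, 5.0), (5.1, 5.0)]
mino = [(3.1, 3.0), (2.9, 3.1)]
keep = undersample_majority(maj, mino, k=1)
balanced = [maj[i] for i in keep] + mino
```

In this toy run the majority point at (3.0, 3.0), which sits among the minority instances, receives the lowest score and is not kept. A faithful implementation would replace the scoring rule with the criterion defined in the paper.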
Pages: 2799-2810
Number of pages: 12