Cluster-based sampling approaches to imbalanced data distributions

Cited by: 0
Authors
Yen, Show-Jane [1]
Lee, Yue-Shi [1]
Affiliation
[1] Ming Chuan Univ, Dept Comp Sci & Informat Engn, Taoyuan 333, Taiwan
DOI
Not available
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In classification problems, the training data significantly influence classification accuracy. When a data set is highly imbalanced, classification algorithms tend to degenerate by assigning all cases to the most common outcome. It is therefore important to select suitable training data when the class distribution is imbalanced. In this paper, we propose cluster-based under-sampling approaches that select representative data as training data to improve classification accuracy under an imbalanced class distribution. A neural network model is used as the base classification algorithm. The experimental results show that our cluster-based under-sampling approaches outperform the under-sampling techniques from previous studies.
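The abstract does not spell out the selection procedure, so the following is only a minimal sketch of one common form of cluster-based under-sampling, not necessarily the authors' method. It assumes that only the majority class is clustered (with a plain k-means written here in NumPy) and that the sample nearest each centroid is kept as that cluster's representative. The function names `kmeans_centroids` and `cluster_undersample`, the parameter names, and the synthetic data are all hypothetical.

```python
import numpy as np


def kmeans_centroids(X, k, n_iter=50, seed=0):
    """Plain Lloyd's k-means; returns the k centroids (assumes k <= len(X))."""
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), size=k, replace=False)].astype(float)
    for _ in range(n_iter):
        # Squared distance from every point to every centroid: shape (n, k).
        d2 = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        new_centroids = centroids.copy()
        for j in range(k):
            members = X[labels == j]
            if len(members):  # an empty cluster keeps its old centroid
                new_centroids[j] = members.mean(axis=0)
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return centroids


def cluster_undersample(X, y, majority_label, n_clusters, seed=0):
    """Keep one representative majority sample per cluster plus all other samples."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    maj_idx = np.flatnonzero(y == majority_label)
    other_idx = np.flatnonzero(y != majority_label)
    X_maj = X[maj_idx]

    centroids = kmeans_centroids(X_maj, n_clusters, seed=seed)
    d2 = ((X_maj[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    labels = d2.argmin(axis=1)

    keep = []
    for j in range(n_clusters):
        members = np.flatnonzero(labels == j)
        if len(members):
            # Representative = the cluster member closest to its centroid.
            keep.append(maj_idx[members[d2[members, j].argmin()]])

    selected = np.sort(np.concatenate([np.array(keep, dtype=int), other_idx]))
    return X[selected], y[selected]


# Synthetic example: 200 majority samples (label 0), 20 minority samples (label 1).
rng = np.random.default_rng(42)
X = np.vstack([rng.normal(0.0, 1.0, size=(200, 2)),
               rng.normal(4.0, 1.0, size=(20, 2))])
y = np.array([0] * 200 + [1] * 20)
X_bal, y_bal = cluster_undersample(X, y, majority_label=0, n_clusters=20)
```

Setting `n_clusters` to the minority-class size yields at most that many majority samples, so the reduced training set is roughly balanced. The reduced set could then be passed to any classifier, such as the neural network model the abstract mentions.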
Pages: 427-436
Page count: 10