The class-imbalance problem for high-dimensional class prediction

Cited by: 7
Authors
Lusa, Lara [1]
Blagus, Rok [1]
Affiliations
[1] Univ Ljubljana, Inst Biostat & Med Informat, Ljubljana, Slovenia
Source
2012 11TH INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND APPLICATIONS (ICMLA 2012), VOL 2, 2012
Keywords
class-imbalance; high-dimensional data; classification
DOI
10.1109/ICMLA.2012.223
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The goal of class prediction studies is to develop rules to accurately predict the class membership of new subjects. The classifiers differ in the way they combine the values of the variables available for each subject. Frequently the classifiers are developed using class-imbalanced data, where the number of samples in each class is not equal. Standard classification methods used on class-imbalanced data are often biased towards the majority class: they classify most new samples in the majority class and they do not accurately predict the minority class. Data are high-dimensional when the number of variables greatly exceeds the number of subjects. In this paper we show how the high-dimensionality poses additional challenges when dealing with class-imbalanced prediction. Here we present new simulation studies for five classifiers, where we expand our previous results to correlated variables, and briefly discuss the results.
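The majority-class bias described in the abstract can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the authors' simulation design: it uses a plain Euclidean nearest-centroid rule (which may not be one of the five classifiers studied), independent standard normal variables, and null data in which no variable differs between the classes. The function name `fraction_predicted_majority` and all sample sizes are hypothetical. Under these assumptions the expected squared distance from a new sample to the centroid of class k is p(1 + 1/n_k), so the minority centroid is systematically farther away and the gap grows with the number of variables p.

```python
import numpy as np

def fraction_predicted_majority(n_major, n_minor, n_vars, n_test=1000, seed=0):
    """Fraction of new samples assigned to the majority class by a
    Euclidean nearest-centroid rule trained on null data (illustrative)."""
    rng = np.random.default_rng(seed)
    # Null data: every variable is N(0, 1) in both classes, so the
    # classes are truly indistinguishable.
    x_major = rng.standard_normal((n_major, n_vars))
    x_minor = rng.standard_normal((n_minor, n_vars))
    x_test = rng.standard_normal((n_test, n_vars))
    # Class centroids estimated from the training samples.
    c_major = x_major.mean(axis=0)
    c_minor = x_minor.mean(axis=0)
    # Squared Euclidean distance from each new sample to each centroid.
    d_major = ((x_test - c_major) ** 2).sum(axis=1)
    d_minor = ((x_test - c_minor) ** 2).sum(axis=1)
    # A sample is assigned to the class with the closer centroid.
    return float(np.mean(d_major < d_minor))

# Imbalanced training set (90 vs 10 samples) with 1000 variables.
imbalanced = fraction_predicted_majority(90, 10, n_vars=1000)
# Balanced training set (50 vs 50 samples) with the same number of variables.
balanced = fraction_predicted_majority(50, 50, n_vars=1000)
print(f"imbalanced: {imbalanced:.3f}, balanced: {balanced:.3f}")
```

With the imbalanced training set, nearly all new samples should be assigned to the majority class even though the classes are identical, while the balanced training set should split the predictions roughly evenly.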
Pages: 123-126
Number of pages: 4
Related papers
50 in total
  • [21] A Cost-Sensitive Sparse Representation Based Classification for Class-Imbalance Problem
    Liu, Zhenbing
    Gao, Chunyang
    Yang, Huihua
    He, Qijia
    SCIENTIFIC PROGRAMMING, 2016, 2016
  • [22] Using Ensembles for Class-Imbalance Problem to Predict Maintainability of Open Source Software
    Malhotra, Ruchika
    Lata, Kusum
    INTERNATIONAL JOURNAL OF RELIABILITY QUALITY AND SAFETY ENGINEERING, 2020, 27 (05)
  • [23] Ensemble learning via constraint projection and undersampling technique for class-imbalance problem
    Guo, Huaping
    Zhou, Jun
    Wu, Chang-An
    SOFT COMPUTING, 2020, 24 (07) : 4711 - 4727
  • [24] An Ensemble Learning-Based Undersampling Technique for Handling Class-Imbalance Problem
    Sarkar, Sobhan
    Khatedi, Nikhil
    Pramanik, Anima
    Maiti, J.
    PROCEEDINGS OF ICETIT 2019: EMERGING TRENDS IN INFORMATION TECHNOLOGY, 2020, 605 : 586 - 595
  • [25] Parameterized Clustering Cleaning Approach for High-Dimensional Datasets with Class Overlap and Imbalance
    Goel N.
    Singaravelu M.
    Gupta S.
    Namana S.
    Singh R.
    Kumar R.
    SN Computer Science, 4 (5)
  • [26] Adaptive Sampling with Optimal Cost for Class-Imbalance Learning
    Peng, Yuxin
    PROCEEDINGS OF THE TWENTY-NINTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2015, : 2921 - 2927
  • [27] Generating Counterfactual Instances for Explainable Class-Imbalance Learning
    Chen, Zhi
    Duan, Jiang
    Kang, Li
    Xu, Hongyan
    Chen, Rui
    Qiu, Guoping
    IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2024, 36 (03) : 1130 - 1144
  • [28] Online Anomaly Detection via Class-Imbalance Learning
    Maurya, Chandresh Kumar
    Toshniwal, Durga
    Venkoparao, Gopalan Vijendran
    2015 EIGHTH INTERNATIONAL CONFERENCE ON CONTEMPORARY COMPUTING (IC3), 2015, : 30 - 35
  • [29] Handling Class-Imbalance with KNN (Neighbourhood) Under-Sampling for Software Defect Prediction
    Goyal, Somya
    ARTIFICIAL INTELLIGENCE REVIEW, 2022, 55 (03) : 2023 - 2064
  • [30] Effective Feature Selection Method for Class-Imbalance Datasets Applied to Chemical Toxicity Prediction
    Antelo-Collado, Aurelio
    Carrasco-Velar, Ramon
    Garcia-Pedrajas, Nicolas
    Cerruela-Garcia, Gonzalo
    JOURNAL OF CHEMICAL INFORMATION AND MODELING, 2021, 61 (01) : 76 - 94