The class-imbalance problem for high-dimensional class prediction

Cited by: 7
Authors
Lusa, Lara [1]
Blagus, Rok [1]
Affiliations
[1] Univ Ljubljana, Inst Biostat & Med Informat, Ljubljana, Slovenia
Source
2012 11TH INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND APPLICATIONS (ICMLA 2012), VOL 2 | 2012
Keywords
class-imbalance; high-dimensional data; classification
DOI
10.1109/ICMLA.2012.223
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The goal of class prediction studies is to develop rules that accurately predict the class membership of new subjects. Classifiers differ in the way they combine the values of the variables available for each subject. Classifiers are frequently developed using class-imbalanced data, where the number of samples in each class is not equal. Standard classification methods used on class-imbalanced data are often biased towards the majority class: they assign most new samples to the majority class and do not accurately predict the minority class. Data are high-dimensional when the number of variables greatly exceeds the number of subjects. In this paper we show how high dimensionality poses additional challenges when dealing with class-imbalanced prediction. We present new simulation studies for five classifiers, in which we extend our previous results to correlated variables, and briefly discuss the results.
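The majority-class bias described in the abstract can be illustrated with a small simulation. The sketch below is not the authors' simulation design: the classifier (k-nearest neighbours), the sample sizes, the number of variables, and the function name `simulate_null_knn` are all assumptions chosen for illustration. It draws every variable from the same standard normal distribution for both classes, so no variable carries class information, and then reports the fraction of new samples from each class that is classified correctly.

```python
import random


def simulate_null_knn(n_major=40, n_minor=10, n_vars=200, n_test=100, k=5, seed=0):
    """Per-class accuracy of k-NN on imbalanced data with no true class difference.

    All settings are illustrative assumptions, not values from the paper.
    Class 0 is the majority class, class 1 the minority class.
    """
    rng = random.Random(seed)

    def draw():
        # One sample: n_vars independent standard normal variables.
        return [rng.gauss(0.0, 1.0) for _ in range(n_vars)]

    # Imbalanced training set; both classes share the same distribution.
    train = [(draw(), 0) for _ in range(n_major)]
    train += [(draw(), 1) for _ in range(n_minor)]

    def predict(x):
        # Squared Euclidean distance from x to every training sample.
        dists = sorted(
            (sum((a - b) ** 2 for a, b in zip(x, xi)), yi) for xi, yi in train
        )
        # Majority vote among the k nearest neighbours.
        minority_votes = sum(yi for _, yi in dists[:k])
        return 1 if minority_votes > k / 2 else 0

    accuracy = {}
    for label in (0, 1):
        # New samples of either class come from the same distribution.
        hits = sum(predict(draw()) == label for _ in range(n_test))
        accuracy[label] = hits / n_test
    return accuracy


if __name__ == "__main__":
    acc = simulate_null_knn()
    print("majority-class accuracy:", acc[0])
    print("minority-class accuracy:", acc[1])
```

Under these assumptions the vote among the nearest neighbours is dominated by majority-class training samples, so most new samples are assigned to the majority class regardless of their true label. The sketch uses independent variables only; it does not cover the correlated-variable setting mentioned in the abstract.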
Pages: 123-126
Number of pages: 4
Related papers
50 in total
  • [1] On Chance Performance in High-Dimensional Class-Imbalance Problems
    Udu, Amadi Gabriel
    Lecchini-Visintini, Andrea
    Dong, Hongbiao
    2024 UKACC 14TH INTERNATIONAL CONFERENCE ON CONTROL, CONTROL, 2024, : 254 - 255
  • [2] Bayes Vector Quantizer for Class-Imbalance Problem
    Diamantini, Claudia
    Potena, Domenico
    IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2009, 21 (05) : 638 - 651
  • [3] Handling class imbalance in high-dimensional biomedical datasets
    Pes, Barbara
    2019 IEEE 28TH INTERNATIONAL CONFERENCE ON ENABLING TECHNOLOGIES: INFRASTRUCTURE FOR COLLABORATIVE ENTERPRISES (WETICE), 2019, : 150 - 155
  • [4] Class prediction for high-dimensional class-imbalanced data
    Rok Blagus
    Lara Lusa
    BMC Bioinformatics, 11
  • [5] Class prediction for high-dimensional class-imbalanced data
    Blagus, Rok
    Lusa, Lara
    BMC BIOINFORMATICS, 2010, 11 : 523
  • [6] Towards Mitigating the Class-Imbalance Problem for Partial Label Learning
    Wang, Jing
    Zhang, Min-Ling
    KDD'18: PROCEEDINGS OF THE 24TH ACM SIGKDD INTERNATIONAL CONFERENCE ON KNOWLEDGE DISCOVERY & DATA MINING, 2018, : 2427 - 2436
  • [7] COMBATING CLASS-IMBALANCE AND OUTLIERS IN GESTATIONAL DIABETES MELLITUS PREDICTION
    Jinlan, Guan
    Guanghui, Fu
    Jiequan, Ou
    Tingting, Wang
    ACTA MEDICA MEDITERRANEA, 2022, 38 (02) : 1167 - 1174
  • [8] High Class-Imbalance in pre-miRNA Prediction: A Novel Approach Based on deepSOM
    Stegmayer, Georgina
    Yones, Cristian
    Kamenetzky, Laura
    Milone, Diego H.
    IEEE-ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS, 2017, 14 (06) : 1316 - 1326
  • [9] Exploratory Undersampling for Class-Imbalance Learning
    Liu, Xu-Ying
    Wu, Jianxin
    Zhou, Zhi-Hua
    IEEE TRANSACTIONS ON SYSTEMS MAN AND CYBERNETICS PART B-CYBERNETICS, 2009, 39 (02) : 539 - 550
  • [10] Measuring the class-imbalance extent of multi-class problems
    Ortigosa-Hernandez, Jonathan
    Inza, Inaki
    Lozano, Jose A.
    PATTERN RECOGNITION LETTERS, 2017, 98 : 32 - 38