The class-imbalance problem for high-dimensional class prediction

Cited by: 7
Authors
Lusa, Lara [1]
Blagus, Rok [1]
Affiliation
[1] Univ Ljubljana, Inst Biostat & Med Informat, Ljubljana, Slovenia
Source
2012 11TH INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND APPLICATIONS (ICMLA 2012), VOL 2 | 2012
Keywords
class-imbalance; high-dimensional data; classification
DOI
10.1109/ICMLA.2012.223
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The goal of class prediction studies is to develop rules that accurately predict the class membership of new subjects. Classifiers differ in the way they combine the values of the variables available for each subject. Classifiers are frequently developed using class-imbalanced data, where the number of samples in each class is not equal. Standard classification methods used on class-imbalanced data are often biased towards the majority class: they assign most new samples to the majority class and do not accurately predict the minority class. Data are high-dimensional when the number of variables greatly exceeds the number of subjects. In this paper we show how high dimensionality poses additional challenges when dealing with class-imbalanced prediction. We present new simulation studies for five classifiers, in which we extend our previous results to correlated variables, and briefly discuss the results.
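The majority-class bias described in the abstract can be illustrated with a small simulation. The sketch below is not the authors' simulation design, and the abstract does not name the five classifiers studied. It assumes a k-nearest-neighbour rule with k = 5, independent standard normal variables with no true difference between the classes, and made-up sample sizes (40 versus 10 training samples, 200 variables). The function name `majority_prediction_rate` and all parameter values are hypothetical and chosen only for illustration.

```python
import random


def majority_prediction_rate(n_maj, n_min, p=200, n_test=100, k=5, seed=0):
    """Fraction of new samples that a k-NN rule assigns to the majority class.

    Assumption: all variables are independent N(0, 1) in both classes, so the
    classes are indistinguishable and any preference for one class comes from
    the class sizes alone. Assumes k is odd, so votes cannot tie.
    """
    rng = random.Random(seed)

    def draw(n):
        # n samples, each with p independent standard normal variables
        return [[rng.gauss(0.0, 1.0) for _ in range(p)] for _ in range(n)]

    train = draw(n_maj + n_min)
    labels = [0] * n_maj + [1] * n_min  # 0 = majority class, 1 = minority class
    test = draw(n_test)

    majority_calls = 0
    for x in test:
        # Squared Euclidean distance from x to every training sample
        dists = [
            (sum((a - b) ** 2 for a, b in zip(x, t)), lab)
            for t, lab in zip(train, labels)
        ]
        dists.sort(key=lambda d: d[0])
        minority_votes = sum(lab for _, lab in dists[:k])
        if minority_votes <= k // 2:  # majority class wins the vote
            majority_calls += 1
    return majority_calls / n_test


if __name__ == "__main__":
    print("imbalanced (40 vs 10):", majority_prediction_rate(40, 10))
    print("balanced   (25 vs 25):", majority_prediction_rate(25, 25))
```

Under these assumptions the imbalanced design should send clearly more than half of the new samples to the majority class, while the balanced design should not show a comparable preference. The exact rates depend on the seed and on the chosen sizes.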
Pages: 123-126
Number of pages: 4
Related papers
50 in total
  • [31] Class-overlap undersampling based on Schur decomposition for Class-imbalance problems
    Dai, Qi
    Liu, Jian-wei
    Shi, Yong-hui
    EXPERT SYSTEMS WITH APPLICATIONS, 2023, 221
  • [32] An Empirical Investigation to Overcome Class-imbalance in Inspection Reviews
    Singh, Maninder
    Walia, Gursimran S.
    Goswami, Anurag
    2017 INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND DATA SCIENCE (MLDS 2017), 2017, : 15 - 22
  • [33] Using SMOTE to Deal with Class-Imbalance Problem in Bioactivity Data to Predict mTOR Inhibitors
    Kumari C.
    Abulaish M.
    Subbarao N.
    SN Computer Science, 2020, 1 (3)
  • [34] Distributed Sparse Class-Imbalance Learning and Its Applications
    Maurya, Chandresh Kumar
    Toshniwal, Durga
    Venkoparao, Gopalan Vijendran
    IEEE TRANSACTIONS ON BIG DATA, 2021, 7 (05) : 832 - 844
  • [35] Exploratory under-sampling for class-imbalance learning
    Liu, Xu-Ying
    Wu, Jianxin
    Zhou, Zhi-Hua
    ICDM 2006: SIXTH INTERNATIONAL CONFERENCE ON DATA MINING, PROCEEDINGS, 2006, : 965 - 969
  • [36] Fault Level Prediction Method for Urban Distribution Network Considering Class-Imbalance Problems
    Zhang, Wen
    Zhang, Xinzhe
    Mou, Huawei
    Fu, Shenghui
    2023 IEEE/IAS INDUSTRIAL AND COMMERCIAL POWER SYSTEM ASIA, I&CPS ASIA, 2023, : 2065 - 2069
  • [37] A Method for Class-Imbalance Learning in Android Malware Detection
    Guan, Jun
    Jiang, Xu
    Mao, Baolei
    ELECTRONICS, 2021, 10 (24)
  • [38] Handling Class-Imbalance with KNN (Neighbourhood) Under-Sampling for Software Defect Prediction
    Somya Goyal
    Artificial Intelligence Review, 2022, 55 : 2023 - 2064
  • [39] The class imbalance problem
    Megahed, Fadel M.
    Chen, Ying-Ju
    Megahed, Aly
    Ong, Yuya
    Altman, Naomi
    Krzywinski, Martin
    NATURE METHODS, 2021, 18 (11) : 1270 - 1272
  • [40] On the Class Imbalance Problem
    Guo, Xinjian
    Yin, Yilong
    Dong, Cailing
    Yang, Gongping
    Zhou, Guangtong
    ICNC 2008: FOURTH INTERNATIONAL CONFERENCE ON NATURAL COMPUTATION, VOL 4, PROCEEDINGS, 2008, : 192 - 201