A Learning Objective Controllable Sphere-Based Method for Balanced and Imbalanced Data Classification

Cited by: 0
Authors
Park, Yeontark [1 ]
Lee, Jong-Seok [1 ]
Affiliations
[1] Sungkyunkwan Univ, Dept Ind Engn, Suwon 16419, South Korea
Funding
National Research Foundation, Singapore;
Keywords
Prototypes; Costs; Training; Performance evaluation; Machine learning; Task analysis; Support vector machines; Classification; class imbalance; sphere covering; learning objective; area under ROC curve; SMOTE; ALGORITHMS;
DOI
10.1109/ACCESS.2021.3130272
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Subject classification code
0812;
Abstract
Imbalanced data classification is one of the most important tasks in machine learning because abnormality, which is usually what interests us, appears less frequently than normality in real-world systems. Learning classifiers from imbalanced data can be troublesome because there is no absolute standard for how skewed a class distribution must be before it counts as imbalanced. To address this issue, this research proposes a new sphere-based classification method named LOCS (learning objective controllable sphere-based classifier), which is designed to maximize AUC (area under the ROC curve). The AUC learning objective was adopted because it approximates accuracy as the class distribution becomes balanced. Therefore, the proposed method properly performs a classification task for both imbalanced and balanced data. It constructs a classification model in a single training run, whereas existing cost-sensitive learning and resampling methods usually require trying several parameter settings. In addition, the learning objective can be easily modified within LOCS for each application domain by setting different importance levels for the positive and negative classes. Numerical experiments on 25 real datasets under several investigational settings showed the effectiveness and the intended strengths of the proposed method.
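The relation between AUC and accuracy mentioned in the abstract can be illustrated with a small sketch. It assumes a hard (single-threshold) classifier, for which the area under the ROC curve reduces to the mean of the true positive and true negative rates. The function names and the weighted form with `w_pos` and `w_neg` are hypothetical illustrations of class importance levels, not the formulation used inside LOCS.

```python
import math

def rates(tp, fn, tn, fp):
    """Return (true positive rate, true negative rate) from confusion counts."""
    return tp / (tp + fn), tn / (tn + fp)

def accuracy(tp, fn, tn, fp):
    """Fraction of all samples classified correctly."""
    return (tp + tn) / (tp + fn + tn + fp)

def single_point_auc(tp, fn, tn, fp):
    """AUC of a hard classifier: area under the ROC curve through one
    operating point, which equals (TPR + TNR) / 2."""
    tpr, tnr = rates(tp, fn, tn, fp)
    return (tpr + tnr) / 2

def weighted_objective(tp, fn, tn, fp, w_pos=1.0, w_neg=1.0):
    """Hypothetical importance-weighted objective (an assumption, not LOCS itself).
    Equal weights give single_point_auc; weights proportional to the
    class sizes give plain accuracy."""
    tpr, tnr = rates(tp, fn, tn, fp)
    return (w_pos * tpr + w_neg * tnr) / (w_pos + w_neg)

# Same per-class rates (TPR = 0.8, TNR = 0.9) under two class distributions.
balanced = dict(tp=80, fn=20, tn=90, fp=10)      # 100 positives, 100 negatives
imbalanced = dict(tp=8, fn=2, tn=900, fp=100)    # 10 positives, 1000 negatives

for name, counts in [("balanced", balanced), ("imbalanced", imbalanced)]:
    print(name, accuracy(**counts), single_point_auc(**counts))
```

With balanced classes the two measures coincide, while under imbalance accuracy drifts toward the majority-class rate and the AUC-style measure does not. Changing the weights moves the objective between these two extremes, which is the kind of control over class importance the abstract describes.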
Pages: 158010-158026
Number of pages: 17