An imbalanced binary classification method via space mapping using normalizing flows with class discrepancy constraints

Cited by: 2
Authors
Huang, Zijian [1]
Gao, Xin [1]
Chen, Wenli [2]
Cheng, Yingying [2]
Xue, Bing [1]
Meng, Zhihang [1]
Zhang, Guangyao [1]
Fu, Shiyuan [1]
Affiliations
[1] Beijing Univ Posts & Telecommun, Sch Artificial Intelligence, Beijing 100876, Peoples R China
[2] State Grid Chongqing Elect Power Co, Mkt Serv Ctr, Chongqing 400014, Peoples R China
Keywords
Imbalanced binary classification; Class overlaps; The flow-based model; Class discrepancy constraints; SMOTE; ALGORITHM
DOI
10.1016/j.ins.2022.12.029
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
For imbalanced binary classification, most methods classify data in the original space by balancing data or modifying classification algorithms. However, for datasets with severe overlaps and complex class boundaries, the classification performance is limited by the ability of the classifier to fit the decision boundary. A practical alternative is to map the data to a latent space with lower classification difficulty. In this paper, we propose an imbalanced binary classification method via space mapping using normalizing flows with class discrepancy constraints. The flow-based model is employed to map the original data to a latent space, ensuring the accuracy of the mapping and the authenticity of the mapped distribution. To solve the problem of overlaps and within-class imbalance, class discrepancy constraints, including global and local constraints, are proposed to modify the flow-based model. The former maps different subclusters to the same cluster of the latent space, and the latter increases the separability of different classes. Classification can thus be carried out in a latent space with a simpler distribution and high separability to achieve better classification performance. Experimental results on 35 KEEL and UCI public datasets indicate that the proposed method outperforms other state-of-the-art methods on F1-measure and G-mean. (c) 2022 Elsevier Inc. All rights reserved.
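The abstract reports results in terms of F1-measure and G-mean. As a minimal sketch of how these two metrics are commonly defined for binary labels, the following computes them from confusion-matrix counts. The function names and the toy labels are illustrative assumptions, not the authors' evaluation code, and the convention of returning 0.0 when a denominator is zero is also an assumption.

```python
import math

def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for one positive label."""
    tp = fp = tn = fn = 0
    for t, p in zip(y_true, y_pred):
        if p == positive:
            if t == positive:
                tp += 1
            else:
                fp += 1
        else:
            if t == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, tn, fn

def f1_measure(y_true, y_pred, positive=1):
    """F1 = 2*TP / (2*TP + FP + FN); 0.0 if undefined (assumed convention)."""
    tp, fp, _, fn = confusion_counts(y_true, y_pred, positive)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

def g_mean(y_true, y_pred, positive=1):
    """G-mean = sqrt(sensitivity * specificity); 0.0 if a class is absent."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred, positive)
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return math.sqrt(sensitivity * specificity)

# Toy imbalanced example (hypothetical data): 2 minority samples, 8 majority.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 0, 0, 0, 0, 1]
all_negative = [0] * len(y_true)  # trivial classifier that ignores the minority

f1_value = f1_measure(y_true, y_pred)
g_value = g_mean(y_true, y_pred)
f1_trivial = f1_measure(y_true, all_negative)
g_trivial = g_mean(y_true, all_negative)
```

In this toy case the trivial all-negative classifier is right on 8 of 10 samples, yet both of its scores are zero because it never finds a minority sample, which is why such metrics are preferred over plain accuracy on imbalanced data.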
Pages: 493-523
Page count: 31