Cost-sensitive positive and unlabeled learning

Cited by: 17
Authors
Chen, Xiuhua [1]
Gong, Chen [1,2]
Yang, Jian [1,3]
Affiliations
[1] Nanjing Univ Sci & Technol, Key Lab Intelligent Percept & Syst High Dimens In, Sch Comp Sci & Engn, PCA Lab,Minist Educ, Nanjing, Peoples R China
[2] Hong Kong Polytech Univ, Dept Comp, Hong Kong, Peoples R China
[3] Jiangsu Key Lab Image & Video Understanding Socia, Minist Educ, Peoples R China
Keywords
Positive and Unlabeled learning (PU learning); Class imbalance; Cost-sensitive learning; Generalization bound; SMOTE
DOI
10.1016/j.ins.2021.01.002
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Positive and Unlabeled learning (PU learning) aims to train a binary classifier solely from positively labeled and unlabeled data when negatively labeled data are absent or too diversely distributed. However, none of the existing PU learning methods takes class imbalance into account; as a result, they tend to neglect the minority class and are likely to produce a biased classifier. Therefore, this paper proposes a novel algorithm termed "Cost-Sensitive Positive and Unlabeled learning" (CSPU), which imposes different misclassification costs on different classes when conducting PU classification. Specifically, we assign distinct weights to the losses caused by false negative and false positive examples, and employ the double hinge loss to build our CSPU algorithm under the framework of empirical risk minimization. Theoretically, we analyze the computational complexity and derive a generalization error bound for CSPU, which guarantees the good performance of our algorithm on test data. Empirically, we compare CSPU with state-of-the-art PU learning methods on a synthetic dataset, OpenML benchmark datasets, and real-world datasets. The results clearly demonstrate the superiority of the proposed CSPU over the other compared methods in dealing with class-imbalanced tasks. (C) 2021 Elsevier Inc. All rights reserved.
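The idea of weighting false-negative and false-positive losses differently inside a PU risk can be illustrated with a minimal sketch. This is not the paper's exact objective, which the abstract does not spell out. It assumes the standard rewriting of the negative-class risk in terms of unlabeled and positive data, (1 - pi) E_N[l(-g)] = E_U[l(-g)] - pi E_P[l(-g)], and the usual double hinge loss l(z) = max(-z, max(0, (1 - z)/2)). The names `cost_weighted_pu_risk`, `c_fn`, `c_fp`, and `prior` are hypothetical and chosen only for illustration.

```python
import numpy as np


def double_hinge(z):
    """Double hinge loss: max(-z, max(0, (1 - z) / 2))."""
    z = np.asarray(z, dtype=float)
    return np.maximum(-z, np.maximum(0.0, 0.5 - 0.5 * z))


def cost_weighted_pu_risk(scores_pos, scores_unl, prior, c_fn=1.0, c_fp=1.0):
    """Empirical cost-weighted PU risk (illustrative, not the paper's formula).

    scores_pos : classifier outputs g(x) on labeled positive examples
    scores_unl : classifier outputs g(x) on unlabeled examples
    prior      : assumed positive class prior pi in (0, 1)
    c_fn, c_fp : hypothetical costs for false negatives / false positives
    """
    scores_pos = np.asarray(scores_pos, dtype=float)
    scores_unl = np.asarray(scores_unl, dtype=float)

    # Loss of positives when treated as positive (false-negative term).
    pos_as_pos = double_hinge(scores_pos).mean()
    # Loss of positives when treated as negative (correction term).
    pos_as_neg = double_hinge(-scores_pos).mean()
    # Loss of unlabeled data when treated as negative.
    unl_as_neg = double_hinge(-scores_unl).mean()

    # c_fn * pi * E_P[l(g)] + c_fp * (E_U[l(-g)] - pi * E_P[l(-g)])
    return (c_fn * prior * pos_as_pos
            - c_fp * prior * pos_as_neg
            + c_fp * unl_as_neg)
```

With equal costs this estimate reduces to the familiar unweighted PU risk; raising `c_fn` relative to `c_fp` penalizes missed positives more heavily, which is the general mechanism a cost-sensitive method relies on when positives are the minority class. Like other estimators of this form, the value can become negative on finite samples.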
Pages: 229-245
Number of pages: 17