Cost-sensitive positive and unlabeled learning

Cited by: 17
Authors
Chen, Xiuhua [1]
Gong, Chen [1, 2]
Yang, Jian [1, 3]
Affiliations
[1] Nanjing Univ Sci & Technol, Key Lab Intelligent Percept & Syst High Dimens In, Sch Comp Sci & Engn, PCA Lab,Minist Educ, Nanjing, Peoples R China
[2] Hong Kong Polytech Univ, Dept Comp, Hong Kong, Peoples R China
[3] Jiangsu Key Lab Image & Video Understanding Socia, Minist Educ, Peoples R China
Keywords
Positive and Unlabeled learning (PU learning); Class imbalance; Cost-sensitive learning; Generalization bound; SMOTE
DOI
10.1016/j.ins.2021.01.002
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Positive and Unlabeled learning (PU learning) aims to train a binary classifier solely from positively labeled and unlabeled data when negatively labeled data are absent or too diversely distributed. However, none of the existing PU learning methods takes the class imbalance problem into account, so they tend to neglect the minority class and are likely to produce a biased classifier. Therefore, this paper proposes a novel algorithm termed "Cost-Sensitive Positive and Unlabeled learning" (CSPU), which imposes different misclassification costs on different classes when conducting PU classification. Specifically, we assign distinct weights to the losses caused by false negative and false positive examples, and employ the double hinge loss to build our CSPU algorithm under the framework of empirical risk minimization. Theoretically, we analyze the computational complexity and derive a generalization error bound for CSPU, which guarantees the good performance of our algorithm on test data. Empirically, we compare CSPU with state-of-the-art PU learning methods on a synthetic dataset, OpenML benchmark datasets, and real-world datasets. The results clearly demonstrate the superiority of the proposed CSPU over the other comparators on class-imbalanced tasks. (C) 2021 Elsevier Inc. All rights reserved.
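The idea of weighting false negative and false positive losses inside a PU risk can be illustrated with a minimal sketch. This is not the paper's exact CSPU objective, which the abstract does not spell out. It assumes a generic unbiased PU risk estimator with a known class prior, and the names `cost_sensitive_pu_risk`, `c_fn`, `c_fp`, and `prior` are hypothetical. The double hinge loss is taken in its usual form, max(-z, max(0, (1 - z) / 2)).

```python
import numpy as np


def double_hinge(z):
    """Double hinge loss: max(-z, max(0, (1 - z) / 2))."""
    z = np.asarray(z, dtype=float)
    return np.maximum(-z, np.maximum(0.0, 0.5 - 0.5 * z))


def cost_sensitive_pu_risk(scores_pos, scores_unl, prior, c_fn=1.0, c_fp=1.0):
    """Cost-weighted empirical PU risk (illustrative assumption, not the paper's formula).

    scores_pos: classifier outputs g(x) on labeled positive examples
    scores_unl: classifier outputs g(x) on unlabeled examples
    prior:      assumed known fraction of positives in the unlabeled data
    c_fn, c_fp: hypothetical costs for false negatives and false positives
    """
    scores_pos = np.asarray(scores_pos, dtype=float)
    scores_unl = np.asarray(scores_unl, dtype=float)
    # Loss for positives scored as negative (false negatives).
    fn_term = prior * np.mean(double_hinge(scores_pos))
    # Loss for negatives scored as positive (false positives), estimated
    # from unlabeled data by subtracting the positive contribution.
    fp_term = np.mean(double_hinge(-scores_unl)) - prior * np.mean(double_hinge(-scores_pos))
    return c_fn * fn_term + c_fp * fp_term


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    pos = rng.normal(loc=1.0, size=20)
    unl = rng.normal(loc=-0.5, size=200)
    print(cost_sensitive_pu_risk(pos, unl, prior=0.1, c_fn=5.0, c_fp=1.0))
```

Raising `c_fn` relative to `c_fp` penalizes missed positives more heavily, which is the usual way a cost-sensitive objective counteracts a rare positive class.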
Pages: 229-245
Page count: 17