Active Sample Selection Through Sparse Neighborhood for Imbalanced Datasets

Citations: 2
Authors
Gu, Ping [1]
Ling, Zhao [1]
Shao, Si Yu [1]
Zhou, Meng [1]
Affiliations
[1] Chongqing Univ, Coll Comp Sci, Chongqing, Peoples R China
Source
2019 IEEE SYMPOSIUM ON COMPUTERS AND COMMUNICATIONS (ISCC) | 2019
Funding
National Natural Science Foundation of China;
Keywords
Active Learning; Imbalance Learning; Sample Selection; Classification; SMOTE;
DOI
10.1109/iscc47284.2019.8969713
Chinese Library Classification number
TP18 [Artificial Intelligence Theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
For imbalanced datasets, a novel bias-based active sampling learning algorithm is proposed. The algorithm combines two important sampling factors, minority confidence and instance informativeness, within an active learning framework. The sampling strategy takes the utility of the selected instances into account while avoiding the sampling of invalid majority instances. To this end, a novel label propagation algorithm over a sparse neighborhood, independent of the hyperparameter k, is proposed to calculate minority confidence. Unlike other semi-supervised learning methods, the algorithm learns the instances via sparse coding theory and adaptively constructs the sparse neighborhood and the sparse neighborhood graph. To calculate instance informativeness, we propose a measure based on the nearest boundary distance. It mainly uses direction vector features and a heuristic search strategy to construct an auxiliary decision boundary, against which the informativeness of each instance is then evaluated.
Pages: 112-117
Number of pages: 6