Online Adaptive Asymmetric Active Learning With Limited Budgets

Cited by: 14
Authors
Zhang, Yifan [1 ]
Zhao, Peilin [3 ]
Niu, Shuaicheng [2 ]
Wu, Qingyao [1 ]
Cao, Jiezhang [1 ]
Huang, Junzhou [3 ]
Tan, Mingkui [1 ]
Affiliations
[1] South China Univ Technol, Sch Software Engn, Guangzhou 510641, Guangdong, Peoples R China
[2] South China Univ Technol, Guangzhou 510641, Guangdong, Peoples R China
[3] Tencent AI Lab, Shenzhen 518172, Guangdong, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Optimization; Indexes; Adaptation models; Manganese; Sensitivity; Correlation; Active learning; online learning; class imbalance; budgeted query; sketching learning;
DOI
10.1109/TKDE.2019.2955078
CLC number
TP18 [Artificial intelligence theory];
Discipline codes
081104; 0812; 0835; 1405
Abstract
Online Active Learning (OAL) aims to handle an unlabeled data stream by selectively querying the labels of informative instances. OAL is applicable to many real-world problems, such as anomaly detection in healthcare and finance. These problems pose two key challenges: the query budget is often limited, and the class ratio is highly imbalanced. In practice, it is difficult to handle an imbalanced unlabeled data stream when only a limited budget of labels can be queried for training. To address this, previous OAL studies adopt either asymmetric losses or asymmetric queries (an isolated asymmetric strategy) to tackle the imbalance, and use first-order methods to optimize a cost-sensitive measure. However, the isolated strategy limits their performance under class imbalance, while first-order methods restrict their optimization performance. In this article, we propose a novel Online Adaptive Asymmetric Active learning algorithm, based on a new asymmetric strategy (merging both the asymmetric loss and asymmetric query strategies) and second-order optimization. We theoretically analyze its mistake bound and cost-sensitive metric bounds. Moreover, to better balance effectiveness and efficiency, we enhance the algorithm with a sketching technique, which significantly accelerates computation with only slight performance degradation. Promising empirical results demonstrate the effectiveness and efficiency of the proposed methods.
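To make the setting concrete, the following is a minimal, generic sketch of budgeted online active learning on an imbalanced stream, combining a cost-sensitive (asymmetric) hinge update with a margin-based randomized query rule. This is an illustration of the problem setup only, not the paper's second-order or sketching-based algorithm; the weighting `rho_pos` and the query rule `delta / (delta + |margin|)` are common generic choices, assumed here for demonstration.

```python
import numpy as np

rng = np.random.default_rng(0)

def asymmetric_active_learner(stream, dim, rho_pos=2.0, delta=1.0, budget=100):
    """Generic sketch: first-order cost-sensitive updates with a
    margin-based query rule under a hard label budget. NOT the
    paper's algorithm (which is second-order and sketched)."""
    w = np.zeros(dim)          # linear model weights
    queries = 0                # labels requested so far
    mistakes = 0               # errors among queried rounds
    for x, y in stream:        # y in {-1, +1}, hidden until queried
        p = w @ x              # prediction margin
        yhat = 1 if p >= 0 else -1
        # Query with probability delta / (delta + |p|): uncertain
        # points (small |margin|) are queried more often.
        if queries < budget and rng.random() < delta / (delta + abs(p)):
            queries += 1
            if yhat != y:
                mistakes += 1
            if y * p < 1:      # positive hinge loss: update
                # Asymmetric step: weigh the rare positive class more.
                cost = rho_pos if y > 0 else 1.0
                w += cost * y * x
    return w, queries, mistakes

# Toy imbalanced stream: thresholding yields few positives.
X = rng.normal(size=(500, 5))
true_w = np.array([1.0, -1.0, 0.5, 0.0, 2.0])
y = np.where(X @ true_w > 1.2, 1, -1)
w, q, m = asymmetric_active_learner(zip(X, y), dim=5, budget=100)
print(q <= 100)  # the query budget is respected
```

The paper replaces the simple additive step above with a second-order (curvature-aware) update and merges the asymmetry into both the loss and the query rule, then accelerates the second-order computation via sketching.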
Pages: 2680-2692
Page count: 13
References
33 entries in total
  • [1] Abe N., 2006, P 12 ACM SIGKDD INT, P504, DOI 10.1145/1150402.1150459
  • [2] [Anonymous], P INT C NEUR INF PRO, DOI 10.1080/01621459.1963
  • [3] [Anonymous], 2008, P 25 INT C MACH LEAR, DOI 10.1145/1390156.1390190
  • [4] [Anonymous], 2010, P 16 ACM SIGKDD INT
  • [5] [Anonymous], 2003, P 20 INT C MACH ICML
  • [6] Cesa-Bianchi N., Conconi A., Gentile C., Second-order perceptron algorithm, SIAM JOURNAL ON COMPUTING, 2005, 34(03): 640-668
  • [7] Cesa-Bianchi N, 2006, J MACH LEARN RES, V7, P1205
  • [8] Crammer K, 2006, J MACH LEARN RES, V7, P551
  • [9] Crammer K., Kulesza A., Dredze M., Adaptive regularization of weight vectors, MACHINE LEARNING, 2013, 91(02): 155-187
  • [10] Crammer Koby., 2009, Advances in Neural Information Processing Systems, P345