Online Adaptive Asymmetric Active Learning With Limited Budgets

Cited by: 14
Authors
Zhang, Yifan [1 ]
Zhao, Peilin [3 ]
Niu, Shuaicheng [2 ]
Wu, Qingyao [1 ]
Cao, Jiezhang [1 ]
Huang, Junzhou [3 ]
Tan, Mingkui [1 ]
Affiliations
[1] South China Univ Technol, Sch Software Engn, Guangzhou 510641, Guangdong, Peoples R China
[2] South China Univ Technol, Guangzhou 510641, Guangdong, Peoples R China
[3] Tencent AI Lab, Shenzhen 518172, Guangdong, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Optimization; Indexes; Adaptation models; Manganese; Sensitivity; Correlation; Active learning; online learning; class imbalance; budgeted query; sketching learning;
DOI
10.1109/TKDE.2019.2955078
CLC number
TP18 [Artificial intelligence theory];
Discipline codes
081104; 0812; 0835; 1405
Abstract
Online Active Learning (OAL) aims to handle an unlabeled data stream by selectively querying the labels of informative instances. OAL is applicable to many real-world problems, such as anomaly detection in healthcare and finance. These problems pose two key challenges: the query budget is often limited, and the class ratio is highly imbalanced. In practice, it is difficult to handle an imbalanced unlabeled data stream when only a limited budget of labels can be queried for training. To address this, previous OAL studies adopt either asymmetric losses or asymmetric queries (an isolated asymmetric strategy) to tackle the imbalance, and use first-order methods to optimize a cost-sensitive measure. However, the isolated strategy limits their performance under class imbalance, while first-order methods restrict their optimization performance. In this article, we propose a novel Online Adaptive Asymmetric Active learning algorithm, based on a new asymmetric strategy (merging both the asymmetric loss and asymmetric query strategies) and second-order optimization. We theoretically analyze its mistake bound and cost-sensitive metric bounds. Moreover, to better balance effectiveness and efficiency, we enhance the algorithm with a sketching technique, which significantly accelerates computation with only slight performance degradation. Promising empirical results demonstrate the effectiveness and efficiency of the proposed methods.
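To make the setting concrete, the following is a minimal, generic sketch of budgeted online active learning on an imbalanced stream, combining a cost-sensitive (asymmetric) hinge update with a margin-based randomized query rule. This is an illustration of the problem setup only, not the paper's second-order or sketching-based algorithm; the weighting `rho_pos` and the query rule `delta / (delta + |margin|)` are common generic choices, assumed here for demonstration.

```python
import numpy as np

rng = np.random.default_rng(0)

def asymmetric_active_learner(stream, dim, rho_pos=2.0, delta=1.0, budget=100):
    """Generic sketch: first-order cost-sensitive updates with a
    margin-based query rule under a hard label budget. NOT the
    paper's algorithm (which is second-order and sketched)."""
    w = np.zeros(dim)          # linear model weights
    queries = 0                # labels requested so far
    mistakes = 0               # errors among queried rounds
    for x, y in stream:        # y in {-1, +1}, hidden until queried
        p = w @ x              # prediction margin
        yhat = 1 if p >= 0 else -1
        # Query with probability delta / (delta + |p|): uncertain
        # points (small |margin|) are queried more often.
        if queries < budget and rng.random() < delta / (delta + abs(p)):
            queries += 1
            if yhat != y:
                mistakes += 1
            if y * p < 1:      # positive hinge loss: update
                # Asymmetric step: weigh the rare positive class more.
                cost = rho_pos if y > 0 else 1.0
                w += cost * y * x
    return w, queries, mistakes

# Toy imbalanced stream: thresholding yields few positives.
X = rng.normal(size=(500, 5))
true_w = np.array([1.0, -1.0, 0.5, 0.0, 2.0])
y = np.where(X @ true_w > 1.2, 1, -1)
w, q, m = asymmetric_active_learner(zip(X, y), dim=5, budget=100)
print(q <= 100)  # the query budget is respected
```

The paper replaces the simple additive step above with a second-order (curvature-aware) update and merges the asymmetry into both the loss and the query rule, then accelerates the second-order computation via sketching.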
Pages: 2680-2692
Page count: 13
References
33 entries in total
  • [1] Abe N., 2006, P 12 ACM SIGKDD INT, P504, DOI 10.1145/1150402.1150459
  • [2] [Anonymous], P INT C NEUR INF PRO, DOI 10.1080/01621459.1963
  • [3] [Anonymous], 2008, P 25 INT C MACH LEAR, DOI 10.1145/1390156.1390190
  • [4] [Anonymous], 2010, P 16 ACM SIGKDD INT
  • [5] [Anonymous], 2003, P 20 INT C MACH ICML
  • [6] Cesa-Bianchi N., Conconi A., Gentile C., Second-order perceptron algorithm, SIAM JOURNAL ON COMPUTING, 2005, 34(03): 640-668
  • [7] Cesa-Bianchi N, 2006, J MACH LEARN RES, V7, P1205
  • [8] Crammer K, 2006, J MACH LEARN RES, V7, P551
  • [9] Crammer K., Kulesza A., Dredze M., Adaptive regularization of weight vectors, MACHINE LEARNING, 2013, 91(02): 155-187
  • [10] Crammer Koby., 2009, Advances in Neural Information Processing Systems, P345