Active Batch Selection via Convex Relaxations with Guaranteed Solution Bounds

被引:47
作者
Chakraborty, Shayok [1 ]
Balasubramanian, Vineeth [2 ]
Sun, Qian [3 ]
Panchanathan, Sethuraman [4 ]
Ye, Jieping [3 ]
机构
[1] Carnegie Mellon Univ, Dept Elect & Comp Engn, Pittsburgh, PA 15213 USA
[2] Indian Inst Technol, Dept Comp Sci & Engn, Hyderabad, Andhra Pradesh, India
[3] Arizona State Univ, Dept Comp Sci & Engn, Tempe, AZ 85281 USA
[4] Arizona State Univ, Ctr Cognit Ubiquitous Comp CUbiC, Tempe, AZ 85281 USA
基金
美国国家科学基金会;
关键词
Batch mode active learning; optimization; QUERY; FACE;
D O I
10.1109/TPAMI.2015.2389848
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
Active learning techniques have gained popularity to reduce human effort in labeling data instances for inducing a classifier. When faced with large amounts of unlabeled data, such algorithms automatically identify the exemplar instances for manual annotation. More recently, there have been attempts towards a batch mode form of active learning, where a batch of data points is simultaneously selected from an unlabeled set. In this paper, we propose two novel batch mode active learning (BMAL) algorithms: BatchRank and BatchRand. We first formulate the batch selection task as an NP-hard optimization problem; we then propose two convex relaxations, one based on linear programming and the other based on semi-definite programming to solve the batch selection problem. Finally, a deterministic bound is derived on the solution quality for the first relaxation and a probabilistic bound for the second. To the best of our knowledge, this is the first research effort to derive mathematical guarantees on the solution quality of the BMAL problem. Our extensive empirical studies on 15 binary, multi-class and multi-label challenging datasets corroborate that the proposed algorithms perform at par with the state-of-the-art techniques, deliver high quality solutions and are robust to real-world issues like label noise and class imbalance.
引用
收藏
页码:1945 / 1958
页数:14
相关论文
共 42 条
[1]  
[Anonymous], UCDCSI200901
[2]  
[Anonymous], 2009, P 26 ANN INT C MACH
[3]  
[Anonymous], 2010, P ACM SIGKDD
[4]  
[Anonymous], 2008, IEEE C COMP VIS PATT
[5]  
[Anonymous], 2006, WORLD WIDE WEB, DOI DOI 10.1145/1135777.1135870
[6]  
[Anonymous], 2010, Advances in neural information processing systems
[7]  
Balasubramanian Vineeth, 2009, 2009 IEEE 12th International Conference on Computer Vision Workshops, ICCV Workshops, P1378, DOI 10.1109/ICCVW.2009.5457449
[8]   The true sample complexity of active learning [J].
Balcan, Maria-Florina ;
Hanneke, Steve ;
Vaughan, Jennifer Wortman .
MACHINE LEARNING, 2010, 80 (2-3) :111-139
[9]  
Baram Y, 2004, J MACH LEARN RES, V5, P255
[10]  
Bilgic L., 2010, Proceedings of the 27th International Conference on Machine Learning, P79