Robust and Discriminative Labeling for Multi-Label Active Learning Based on Maximum Correntropy Criterion

Cited by: 86
Authors
Du, Bo [1 ,2 ]
Wang, Zengmao [1 ]
Zhang, Lefei [1 ]
Zhang, Liangpei [3 ]
Tao, Dacheng [4 ]
Affiliations
[1] Wuhan Univ, Sch Comp, Wuhan 430079, Peoples R China
[2] Univ Technol Sydney, FEIT, Sydney, NSW 2007, Australia
[3] Wuhan Univ, State Key Lab Informat Engn Surveying Mapping & R, Wuhan 430072, Peoples R China
[4] Univ Sydney, Fac Engn & Informat Technol, Sch Informat Technol, Darlington, NSW 2008, Australia
Funding
National Natural Science Foundation of China; Australian Research Council;
Keywords
Active learning; multi-label learning; multi-label classification; DRIVEN;
DOI
10.1109/TIP.2017.2651372
CLC number
TP18 [Artificial Intelligence Theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
Multi-label learning has drawn great interest in many real-world applications. Having an oracle assign many labels to a single instance is highly costly, yet it is also hard to build a good model without identifying the discriminative labels. Can we reduce the labeling cost and, at the same time, improve our ability to train a good model for multi-label learning? Active learning addresses the shortage of training samples by querying the most valuable samples, achieving better performance at little cost. In multi-label active learning, existing work either queries only the relevant labels with fewer training samples or queries all labels without identifying the discriminative information; neither approach can effectively handle outlier labels when measuring uncertainty. Since the maximum correntropy criterion (MCC) provides a robust treatment of outliers in many machine learning and data mining algorithms, in this paper we derive a robust multi-label active learning algorithm based on MCC that merges uncertainty and representativeness, and we propose an efficient alternating optimization method to solve it. With MCC, our method eliminates the influence of outlier labels that are not discriminative for measuring uncertainty. To further improve the information measure, we merge uncertainty and representativeness with the predicted labels of the unlabeled data, which not only enhances the uncertainty measure but also improves the similarity measurement of multi-label data by exploiting label information. Experiments on benchmark multi-label data sets show superior performance over state-of-the-art methods.
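Background note (not taken from the record above; this is the standard textbook definition, with f and sigma used as illustrative symbols): correntropy between targets and predictions is typically estimated with a Gaussian kernel, and the maximum correntropy criterion maximizes this bounded similarity, so large residuals from outlier labels contribute little:

\hat{V}_{\sigma}\big(Y, f(X)\big) = \frac{1}{n}\sum_{i=1}^{n}\exp\!\left(-\frac{\big(y_i - f(x_i)\big)^2}{2\sigma^2}\right), \qquad \max_{f}\ \hat{V}_{\sigma}\big(Y, f(X)\big),

where \sigma is the kernel bandwidth. Because each term lies in (0, 1], a label with a very large residual cannot dominate the objective, which is the robustness to outlier labels that the abstract appeals to. The paper's actual objective, which also merges uncertainty and representativeness, may differ from this generic form.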
Pages: 1694 - 1707
Number of pages: 14