Learning with Inadequate and Incorrect Supervision

Cited by: 36
Authors
Gong, Chen [1]
Zhang, Hengmin [1]
Yang, Jian [1]
Tao, Dacheng [2]
Affiliations
[1] Nanjing Univ Sci & Technol, Sch Comp Sci & Engn, Nanjing, Jiangsu, Peoples R China
[2] Univ Sydney, FEIT, SIT, UBTECH Sydney Ctr, Sydney, NSW, Australia
Source
2017 17TH IEEE INTERNATIONAL CONFERENCE ON DATA MINING (ICDM) | 2017
Funding
Australian Research Council;
Keywords
semi-supervised learning; label noise; graph trend filtering; smooth eigenbase pursuit; FRAMEWORK;
DOI
10.1109/ICDM.2017.110
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In practice, we often face the dilemma that the labeled data at hand are inadequate to train a reliable classifier and, more seriously, that some of these labeled data may be mistakenly labeled due to various human factors. Therefore, this paper proposes a novel semi-supervised learning paradigm that can handle both label insufficiency and label inaccuracy. To address label insufficiency, we use a graph to bridge the data points so that the label information can be propagated from the scarce labeled examples to unlabeled examples along the graph edges. To address label inaccuracy, Graph Trend Filtering (GTF) and Smooth Eigenbase Pursuit (SEP) are adopted to filter out the initial noisy labels. GTF penalizes the ℓ0 norm of label difference between connected examples in the graph and exhibits better local adaptivity than the traditional ℓ2 norm-based Laplacian smoother. SEP reconstructs the correct labels by emphasizing the leading eigenvectors of the Laplacian matrix associated with small eigenvalues, as these eigenvectors reflect real label smoothness and carry rich class separation cues. We term our algorithm "Semi-supervised learning under Inadequate and Incorrect Supervision" (SIIS). Thorough experimental results on image classification, text categorization, and speech recognition demonstrate that SIIS is effective in label error correction, leading to superior performance over state-of-the-art methods in the presence of label noise and label scarcity.
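The eigenvector idea in the abstract can be illustrated with a small sketch. This is not the paper's algorithm: it omits the GTF penalty and the actual SEP formulation, and simply fits noisy labels by least squares in the span of the Laplacian eigenvectors with the smallest eigenvalues. The function name `smooth_eigenbasis_fit`, the Gaussian-kernel graph, the values of `k` and `sigma`, and the toy data are all assumptions made for illustration.

```python
import numpy as np

def smooth_eigenbasis_fit(X, y_noisy, labeled_idx, k=2, sigma=0.5):
    """Hypothetical helper: fit noisy labels in the span of the k
    Laplacian eigenvectors with the smallest eigenvalues."""
    # Gaussian-kernel affinity graph over all 1-D points (assumed graph choice).
    d2 = (X[:, None] - X[None, :]) ** 2
    W = np.exp(-d2 / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)
    # Unnormalized graph Laplacian L = D - W.
    L = np.diag(W.sum(axis=1)) - W
    # eigh returns eigenvalues in ascending order, so the first k columns
    # are the smoothest eigenvectors on the graph.
    _, vecs = np.linalg.eigh(L)
    U = vecs[:, :k]
    # Least-squares fit that uses only the rows of the labeled examples.
    coef, *_ = np.linalg.lstsq(U[labeled_idx], y_noisy[labeled_idx], rcond=None)
    # Smooth label scores for every example, labeled or not.
    return U @ coef

# Toy data: two well-separated 1-D clusters with true labels +1 and -1.
X = np.concatenate([np.linspace(0.0, 1.0, 10), np.linspace(5.0, 6.0, 10)])
y_true = np.concatenate([np.ones(10), -np.ones(10)])

# Every second example is labeled; one label per cluster is flipped (noise).
labeled_idx = np.arange(0, 20, 2)
y_noisy = y_true.copy()
y_noisy[[4, 14]] *= -1

scores = smooth_eigenbasis_fit(X, y_noisy, labeled_idx)
y_pred = np.sign(scores)
print("recovered all labels:", bool(np.array_equal(y_pred, y_true)))
```

Because the two smoothest eigenvectors are nearly constant within each cluster, the fit amounts to a per-cluster vote over the labeled examples, so a minority of flipped labels is overruled and the unlabeled examples inherit the cluster's label.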
Pages: 889-894
Page count: 6