Learning kernel logistic regression in the presence of class label noise

Cited by: 42
Authors
Bootkrajang, Jakramate [1]
Kaban, Ata [1]
Affiliations
[1] Univ Birmingham, Sch Comp Sci, Birmingham B15 2TT, W Midlands, England
Keywords
Classification; Label noise; Model selection; Multiple Kernel Learning; DISCRIMINANT-ANALYSIS; INITIAL SAMPLES; PATTERN-RECOGNITION; MODEL SELECTION; CLASSIFICATION; MISCLASSIFICATION; REGULARIZATION; MICROARRAYS; ALGORITHM;
DOI
10.1016/j.patcog.2014.05.007
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The classical machinery of supervised learning machines relies on a correct set of training labels. Unfortunately, there is no guarantee that all of the labels are correct. Labelling errors are increasingly noticeable in today's classification tasks, as the scale and difficulty of these tasks increase so much that perfect label assignment becomes nearly impossible. Several algorithms have been proposed to alleviate the problem, of which a robust Kernel Fisher Discriminant is a successful example. However, for classification, discriminative models are of primary interest, and rather curiously, the very few existing label-robust discriminative classifiers are limited to linear problems. In this paper, we build on the widely used and successful kernelising technique to introduce a label-noise robust Kernel Logistic Regression classifier. The main difficulty that we need to bypass is how to determine the model complexity parameters when no trusted validation set is available. We propose to adapt the Multiple Kernel Learning approach for this new purpose, together with a Bayesian regularisation scheme. Empirical results on 13 benchmark data sets and two real-world applications demonstrate the success of our approach. (C) 2014 Elsevier Ltd. All rights reserved.
Pages: 3641-3655
Number of pages: 15
Related papers
47 in total
[1] [Anonymous], 2006, Association for the Advancement of Artificial Intelligence
[2] [Anonymous], 2001, P 18 INT C MACHINE L
[3] Barandela R, 2000, LECT NOTES COMPUT SC, V1876, P621
[4] Biggio B., 2011, AS C MACH LEARN, P97
[5] Bootkrajang Jakramate, 2012, Machine Learning and Knowledge Discovery in Databases. Proceedings of the European Conference (ECML PKDD 2012), P143, DOI 10.1007/978-3-642-33460-3_15
[6] Bootkrajang J., 2011, P ESANN, P345
[7] Classification of mislabelled microarrays using robust sparse logistic regression [J]. Bootkrajang, Jakramate; Kaban, Ata. BIOINFORMATICS, 2013, 29 (07): 870-877
[8] Robust supervised classification with mixture models: Learning from data with uncertain labels [J]. Bouveyron, Charles; Girard, Stephane. PATTERN RECOGNITION, 2009, 42 (11): 2649-2658
[9] Identifying mislabeled training data [J]. Brodley, CE; Friedl, MA. JOURNAL OF ARTIFICIAL INTELLIGENCE RESEARCH, 1999, 11: 131-167
[10] Cawley GC, 2007, J MACH LEARN RES, V8, P841