On the Robustness of Decision Tree Learning Under Label Noise

被引:19
作者
Ghosh, Aritra [1 ]
Manwani, Naresh [2 ]
Sastry, P. S. [3 ]
机构
[1] Microsoft, Bangalore, Karnataka, India
[2] Int Inst Informat Technol, Hyderabad, India
[3] Indian Inst Sci, Bangalore, Karnataka, India
来源
ADVANCES IN KNOWLEDGE DISCOVERY AND DATA MINING, PAKDD 2017, PT I | 2017年 / 10234卷
关键词
Robust learning; Decision trees; Label noise; CLASSIFICATION;
D O I
10.1007/978-3-319-57454-7_53
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
In most practical problems of classifier learning, the training data suffers from label noise. Most theoretical results on robustness to label noise involve either estimation of noise rates or non-convex optimization. Further, none of these results are applicable to standard decision tree learning algorithms. This paper presents some theoretical analysis to show that, under some assumptions, many popular decision tree learning algorithms are inherently robust to label noise. We also present some sample complexity results which provide some bounds on the sample size for the robustness to hold with a high probability. Through extensive simulations we illustrate this robustness.
引用
收藏
页码:685 / 697
页数:13
相关论文
共 19 条
[1]  
[Anonymous], 2013, C LEARN THEOR
[2]   Random forests [J].
Breiman, L .
MACHINE LEARNING, 2001, 45 (01) :5-32
[3]   Identifying mislabeled training data [J].
Brodley, CE ;
Friedl, MA .
JOURNAL OF ARTIFICIAL INTELLIGENCE RESEARCH, 1999, 11 :131-167
[4]   LIBSVM: A Library for Support Vector Machines [J].
Chang, Chih-Chung ;
Lin, Chih-Jen .
ACM TRANSACTIONS ON INTELLIGENT SYSTEMS AND TECHNOLOGY, 2011, 2 (03)
[5]  
du Plessis MC, 2014, ADV NEUR IN, V27
[6]   Classification in the Presence of Label Noise: a Survey [J].
Frenay, Benoit ;
Verleysen, Michel .
IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2014, 25 (05) :845-869
[7]   Making risk minimization tolerant to label noise [J].
Ghosh, Aritra ;
Manwani, Naresh ;
Sastry, P. S. .
NEUROCOMPUTING, 2015, 160 :93-107
[8]  
LICHMAN M., 2013, UCI MACHINE LEARNING
[9]   Random classification noise defeats all convex potential boosters [J].
Long, Philip M. ;
Servedio, Rocco A. .
MACHINE LEARNING, 2010, 78 (03) :287-304
[10]   Noise Tolerance Under Risk Minimization [J].
Manwani, Naresh ;
Sastry, P. S. .
IEEE TRANSACTIONS ON CYBERNETICS, 2013, 43 (03) :1146-1151