Handling missing values in support vector machine classifiers

Cited by: 143
Authors
Pelckmans, K
De Brabanter, J
Suykens, JAK
De Moor, B
Affiliations
[1] Katholieke Univ Leuven, ESAT, SCD, SISTA, B-3001 Louvain, Belgium
[2] Katholieke Univ Leuven, Hogesch KaHo St Lieven, B-9000 Ghent, Belgium
Keywords
DOI
10.1016/j.neunet.2005.06.025
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This paper discusses the task of learning a classifier from observed data containing missing values amongst the inputs which are missing completely at random. A non-parametric perspective is adopted by defining a modified risk taking into account the uncertainty of the predicted outputs when missing values are involved. It is shown that this approach generalizes the approach of mean imputation in the linear case and the resulting kernel machine reduces to the standard Support Vector Machine (SVM) when no input values are missing. Furthermore, the method is extended to the multivariate case of fitting additive models using componentwise kernel machines, and an efficient implementation is based on the Least Squares Support Vector Machine (LS-SVM) classifier formulation. (c) 2005 Elsevier Ltd. All rights reserved.
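The abstract states that the proposed approach generalizes mean imputation in the linear case. As a rough illustration of that baseline only (not the paper's modified-risk method), the sketch below fills missing inputs with column means and then fits a linear least-squares classifier on +/-1 labels, in the spirit of the LS-SVM formulation. The function names `mean_impute` and `fit_linear_ls_classifier`, the regularization constant `gamma`, and the toy data are all assumptions made for this example.

```python
import numpy as np


def mean_impute(X):
    """Replace NaN entries by the mean of the observed values in each column."""
    X = np.asarray(X, dtype=float).copy()
    col_means = np.nanmean(X, axis=0)          # means over observed entries only
    missing = np.isnan(X)
    # Boolean indexing and np.nonzero both walk the array in row-major order,
    # so each missing cell receives the mean of its own column.
    X[missing] = np.take(col_means, np.nonzero(missing)[1])
    return X, col_means


def fit_linear_ls_classifier(X, y, gamma=10.0):
    """Minimise 0.5*||w||^2 + 0.5*gamma*sum((y_i - w.x_i - b)^2) for y_i in {-1, +1}.

    This is a linear, primal-form least-squares classifier; it is an assumed
    stand-in for illustration, not the componentwise kernel machine of the paper.
    """
    n, d = X.shape
    A = np.hstack([X, np.ones((n, 1))])        # last column carries the bias b
    R = np.eye(d + 1) / gamma
    R[-1, -1] = 0.0                            # the bias term is not regularised
    wb = np.linalg.solve(A.T @ A + R, A.T @ y)
    return wb[:-1], wb[-1]


# Toy usage: two Gaussian classes, with 20% of the inputs removed
# completely at random (hypothetical data).
rng = np.random.default_rng(0)
y_demo = np.where(rng.random(200) < 0.5, 1.0, -1.0)
X_demo = rng.normal(size=(200, 2)) + 2.0 * y_demo[:, None]
X_demo[rng.random(X_demo.shape) < 0.2] = np.nan

X_filled, _ = mean_impute(X_demo)
w_demo, b_demo = fit_linear_ls_classifier(X_filled, y_demo)
accuracy_demo = np.mean(np.sign(X_filled @ w_demo + b_demo) == y_demo)
```

Mean imputation treats every filled-in value as if it had been observed; according to the abstract, the paper's modified risk instead accounts for the uncertainty of predictions that involve missing inputs.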
Pages: 684-692
Page count: 9