Fast cross-validation algorithms for least squares support vector machine and kernel ridge regression

Cited by: 270
Authors
An, Senjian [1]
Liu, Wanquan [1]
Venkatesh, Svetha [1]
Affiliations
[1] Curtin Univ Technol, Dept Comp, Perth, WA 6845, Australia
Keywords
model selection; cross-validation; kernel methods
DOI
10.1016/j.patcog.2006.12.015
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Given n training examples, training a least squares support vector machine (LS-SVM) or kernel ridge regression (KRR) model corresponds to solving a linear system of dimension n. In cross-validating LS-SVM or KRR, the training examples are split into two distinct subsets l times, where a subset of m examples is used for validation and the remaining (n - m) examples are used to train the classifier. In this case, l linear systems of dimension (n - m) need to be solved. We propose a novel method for cross-validation (CV) of LS-SVM or KRR in which, instead of solving l linear systems of dimension (n - m), we compute the inverse of an n-dimensional square matrix once and solve l linear systems of dimension m, thereby reducing the complexity when l is large and/or m is small. Typical multi-fold, leave-one-out (LOO-CV) and leave-many-out cross-validations are considered. For the five-fold CV used in practice, with five repetitions over randomly drawn slices, the proposed algorithm is approximately four times as efficient as the naive implementation. For large data sets, we propose to evaluate the CV approximately by applying the well-known incomplete Cholesky decomposition technique; the complexity of these approximate algorithms scales linearly with the data size if the rank of the associated kernel matrix is much smaller than n. Simulations demonstrate the performance of LS-SVM and the efficiency of the proposed algorithm, with comparisons to the naive and some existing implementations of multi-fold and LOO-CV. (C) 2007 Pattern Recognition Society. Published by Elsevier Ltd. All rights reserved.
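The trick the abstract describes can be illustrated compactly. Below is a minimal Python sketch, not the authors' implementation: it covers plain KRR without the LS-SVM bias term, and the function and argument names (fast_cv_krr_residuals, lam, folds) are hypothetical. It relies on the standard identity that, with C = K + lam*I, the hold-out residuals on a validation index set V satisfy r_V = inv(inv(C)[V, V]) @ (inv(C) @ y)[V], so after a single n x n inversion each fold costs only an m x m solve.

```python
import numpy as np

def fast_cv_krr_residuals(K, y, lam, folds):
    """Hold-out residuals of kernel ridge regression for each fold,
    computed from one n x n inverse instead of l refits.

    Identity used (plain KRR, no bias term): with C = K + lam*I,
    the residuals on held-out indices V are
        r_V = inv(inv(C)[V, V]) @ (inv(C) @ y)[V].
    """
    n = K.shape[0]
    C_inv = np.linalg.inv(K + lam * np.eye(n))  # one n x n inverse, shared by all folds
    alpha = C_inv @ y                           # inv(C) @ y, also shared
    residuals = []
    for V in folds:
        V = np.asarray(V)
        # each fold now costs one m x m solve (m = len(V)), not an (n - m) one
        residuals.append(np.linalg.solve(C_inv[np.ix_(V, V)], alpha[V]))
    return residuals

# Example: five-fold CV error on synthetic data with an RBF kernel.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=100)
K = np.exp(-0.5 * ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1))
folds = np.array_split(rng.permutation(100), 5)
cv_mse = np.mean(np.concatenate(fast_cv_krr_residuals(K, y, 1e-2, folds)) ** 2)
```

Each returned r_V equals y[V] minus the predictions of a KRR model refit on the complementary (n - m) points, which can be verified against a naive per-fold refit; extending the sketch to LS-SVM requires carrying the bias term through an augmented linear system, as the paper does.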
Pages: 2154-2162
Number of pages: 9