Adaptive Kernel Methods Using the Balancing Principle

Cited by: 38
Authors
De Vito, E. [2 ,3 ]
Pereverzyev, S. [4 ]
Rosasco, L. [1 ,5 ]
Affiliations
[1] MIT, Ctr Biol & Computat Learning, Cambridge, MA 02139 USA
[2] Univ Genoa, DSA, Genoa, Italy
[3] Ist Nazl Fis Nucl, I-16146 Genoa, Italy
[4] Austrian Acad Sci, Johann Radon Inst Computat & Appl Math, A-4040 Linz, Austria
[5] Univ Genoa, DISI, Genoa, Italy
Keywords
Learning Theory; Model Selection; Adaptive Regularization; Inverse Problems; Regularization Algorithms; Selection; Rates
DOI
10.1007/s10208-010-9064-2
Chinese Library Classification
TP301 [Theory and Methods]
Discipline Classification Code
081202
Abstract
The choice of the regularization parameter is a fundamental problem in Learning Theory, since the performance of most supervised algorithms crucially depends on one or more such parameters. In particular, a main theoretical issue concerns the amount of prior knowledge needed to choose the regularization parameter so as to obtain good learning rates. In this paper we present a parameter choice strategy, called the balancing principle, that selects the regularization parameter without knowledge of the regularity of the target function; the resulting choice adaptively achieves the best error rate. Our main result applies to regularization algorithms in a reproducing kernel Hilbert space with the square loss, though we also study how a similar principle can be used in other settings. As a straightforward corollary we immediately derive adaptive parameter choices for various recently studied kernel methods. Numerical experiments with the proposed parameter choice rules are also presented.
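The balancing principle lends itself to a compact implementation. Below is a minimal sketch, in Python with NumPy, of a Lepskii-type balancing rule for kernel ridge regression: among a grid of candidate parameters it keeps the largest lambda whose estimator stays, in RKHS norm, within a fixed multiple of a sample-error surrogate of every estimator with a smaller lambda. The Gaussian kernel, the geometric grid, the constant C = 4, and the surrogate bound B(lambda) proportional to 1/(sqrt(n)*lambda) are illustrative assumptions, not the paper's exact procedure or constants.

import numpy as np

def gaussian_kernel(A, B, sigma=1.0):
    # Gaussian RBF kernel matrix between the rows of A and B
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def balancing_principle(X, y, lambdas, sigma=1.0, C=4.0):
    # Lepskii-type balancing rule for kernel ridge regression (sketch).
    # Returns the largest candidate lambda whose estimator is within
    # C * B(lambda') of every estimator with a smaller lambda'.
    lambdas = sorted(lambdas)          # candidates in increasing order
    n = len(y)
    K = gaussian_kernel(X, X, sigma)
    # Coefficients of f_lambda: alpha = (K + n*lambda*I)^{-1} y
    alphas = [np.linalg.solve(K + n * lam * np.eye(n), y) for lam in lambdas]
    # Surrogate sample-error bound, up to constants: B(lambda) ~ 1/(sqrt(n)*lambda)
    bounds = [1.0 / (np.sqrt(n) * lam) for lam in lambdas]

    def rkhs_dist(a, b):
        # ||f_a - f_b||_H^2 = (a - b)^T K (a - b)
        d = a - b
        return np.sqrt(max(d @ K @ d, 0.0))

    best = 0
    for i in range(len(lambdas)):
        if all(rkhs_dist(alphas[i], alphas[j]) <= C * bounds[j] for j in range(i)):
            best = i
    return lambdas[best], alphas[best]

# Hypothetical usage: synthetic 1-D regression, geometric grid of candidates.
rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(100, 1))
y = np.sin(3.0 * X[:, 0]) + 0.1 * rng.standard_normal(100)
lam_star, alpha = balancing_principle(X, y, [1e-4 * 1.5 ** k for k in range(20)])

The geometric grid mirrors the trade-off the principle balances: the approximation error grows with lambda while the sample-error surrogate decays, and the rule stops at the largest scale where the two remain comparable.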
Pages: 455-479
Number of pages: 25