When Is There a Representer Theorem? Vector Versus Matrix Regularizers

Cited by: 0
Authors
Argyriou, Andreas [1]
Micchelli, Charles A. [2]
Pontil, Massimiliano [1]
Affiliations
[1] UCL, Dept Comp Sci, London WC1E 6BT, England
[2] SUNY Albany, Dept Math & Stat, Albany, NY 12222 USA
Funding
Engineering and Physical Sciences Research Council (EPSRC), UK
Keywords
kernel methods; matrix learning; minimal norm interpolation; multi-task learning; regularization; multiple tasks; networks
DOI
Not available
Chinese Library Classification
TP [Automation Technology, Computer Technology]
Discipline Code
0812
Abstract
We consider a general class of regularization methods which learn a vector of parameters on the basis of linear measurements. It is well known that if the regularizer is a nondecreasing function of the L2 norm, then the learned vector is a linear combination of the input data. This result, known as the representer theorem, underlies kernel-based methods in machine learning. In this paper, we prove the necessity of the above condition in the case of differentiable regularizers. We further extend our analysis to regularization methods which learn a matrix, a problem motivated by the application to multi-task learning. In this context, we study a more general representer theorem, which holds for a larger class of regularizers. We provide a necessary and sufficient condition characterizing this class of matrix regularizers, and we highlight some concrete examples of practical importance. Our analysis uses basic principles from matrix theory, in particular the notion of matrix nondecreasing functions.
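As a minimal numerical sketch of the vector representer theorem summarized above (assuming the squared L2 norm as regularizer, i.e., ridge regression, with synthetic data; the variable names and problem sizes below are illustrative, not from the paper), the following Python snippet checks that the primal minimizer coincides with a linear combination of the input data:

```python
import numpy as np

# Representer theorem sketch: with a squared-L2 regularizer (a nondecreasing
# function of the L2 norm), the learned vector is a linear combination of the
# input data. All names and sizes here are illustrative assumptions.

rng = np.random.default_rng(0)
n, d = 5, 20                      # fewer measurements than parameters
X = rng.standard_normal((n, d))   # rows are the input data x_1, ..., x_n
y = rng.standard_normal(n)        # linear measurements
lam = 0.1                         # regularization parameter

# Primal solution: w = argmin_w ||Xw - y||^2 + lam * ||w||^2
w_primal = np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

# Representer form: w = X^T c, with dual coefficients c = (X X^T + lam I)^{-1} y
c = np.linalg.solve(X @ X.T + lam * np.eye(n), y)
w_dual = X.T @ c

# The two solutions coincide: the minimizer lies in the span of the data.
print(np.allclose(w_primal, w_dual))  # True
```

Because the squared L2 norm is a nondecreasing function of the L2 norm, the theorem guarantees that the minimizer lies in the span of the rows of X; the dual coefficients c are exactly the combination weights.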
Pages: 2507-2529
Page count: 23