Large scale multiple kernel learning

Cited by: 0
Authors
Sonnenburg, Soeren
Raetsch, Gunnar
Schaefer, Christin
Schoelkopf, Bernhard
Affiliations
[1] Fraunhofer FIRST.IDA, D-12489 Berlin, Germany
[2] Max Planck Society, Friedrich Miescher Laboratory, Tübingen, Germany
[3] Max Planck Institute for Biological Cybernetics, D-72076 Tübingen, Germany
Keywords
multiple kernel learning; string kernels; large scale optimization; support vector machines; support vector regression; column generation; semi-infinite linear programming
DOI
Not available
Chinese Library Classification
TP [automation and computer technology]
Discipline code
0812
Abstract
While classical kernel-based learning algorithms are based on a single kernel, in practice it is often desirable to use multiple kernels. Lanckriet et al. (2004) considered conic combinations of kernel matrices for classification, leading to a convex quadratically constrained quadratic program. We show that it can be rewritten as a semi-infinite linear program that can be efficiently solved by recycling standard SVM implementations. Moreover, we generalize the formulation and our method to a larger class of problems, including regression and one-class classification. Experimental results show that the proposed algorithm scales to hundreds of thousands of examples or hundreds of kernels to be combined, and aids automatic model selection, improving the interpretability of the learning result. In a second part we discuss general speed-up mechanisms for SVMs, especially when used with sparse feature maps as they appear for string kernels, allowing us to train a string kernel SVM on a 10-million-example real-world splice data set from computational biology. We integrated multiple kernel learning into our machine learning toolbox SHOGUN, for which the source code is publicly available at http://www.fml.tuebingen.mpg.de/raetsch/projects/shogun.
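The conic kernel combination at the heart of the approach can be sketched in a few lines. This is a minimal illustration of the combined kernel K = Σ_k β_k K_k with β_k ≥ 0, not the paper's semi-infinite LP solver; the two base kernels and the weights chosen here are arbitrary assumptions for the example.

```python
import numpy as np

def combine_kernels(kernels, beta):
    """Conic combination K = sum_k beta_k * K_k with beta_k >= 0, sum(beta) = 1."""
    beta = np.asarray(beta, dtype=float)
    assert np.all(beta >= 0) and np.isclose(beta.sum(), 1.0)
    return sum(b * K for b, K in zip(beta, kernels))

# Toy data and two base kernel matrices: linear and Gaussian (RBF).
rng = np.random.default_rng(0)
X = rng.standard_normal((5, 3))
K_lin = X @ X.T
sq_dists = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
K_rbf = np.exp(-sq_dists / 2.0)

# In MKL, the weights beta are learned; here they are fixed for illustration.
K = combine_kernels([K_lin, K_rbf], [0.3, 0.7])

# A conic combination of positive semidefinite matrices is again PSD,
# so K is a valid kernel matrix that a standard SVM solver can consume.
print(np.linalg.eigvalsh(K).min() >= -1e-9)
```

In the paper's wrapper scheme, each iteration trains a standard single-kernel SVM on such a combined matrix and then updates β via a restricted linear program, which is why existing SVM implementations can be recycled.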
Pages: 1531–1565
Page count: 35
References
30 in total
[1] Anonymous, 2003, HP INVENT.
[2] Anonymous, 1998, Encyclopedia of Biostatistics.
[3] Bach F R, 2004, Proceedings of the 21st International Conference on Machine Learning, p. 6.
[4] Bennett B C, 2002, Journal of Sport & Exercise Psychology, 24: 31.
[5] Bi J, 2004, Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, p. 521.
[6] Boyd S, 2004, Convex Optimization.
[7] Chapelle O, Vapnik V, Bousquet O, Mukherjee S. Choosing multiple parameters for support vector machines. Machine Learning, 2002, 46(1-3): 131-159.
[8] Cortes C, 1995, Machine Learning, 20: 273. DOI 10.1023/A:1022627411411.
[9] Davis J, 2006, Technical Report 1551, University of Wisconsin-Madison.
[10] Fredkin E. Trie memory. Communications of the ACM, 1960, 3(9): 490-499.