Fast Kronecker Product Kernel Methods via Generalized Vec Trick

被引:16
作者
Airola, Antti [1 ]
Pahikkala, Tapio [1 ]
机构
[1] Univ Turku, Dept Future Technol, FI-20014 Turku, Finland
基金
芬兰科学院;
关键词
Bipartite graph learning; kernel methods; Kronecker product kernel; ridge regression; support vector machine (SVM); transfer learning; zero-shot learning; NEWTON METHOD; MACHINE;
D O I
10.1109/TNNLS.2017.2727545
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
Kronecker product kernel provides the standard approach in the kernel methods' literature for learning from graph data, where edges are labeled and both start and end vertices have their own feature representations. The methods allow generalization to such new edges, whose start and end vertices do not appear in the training data, a setting known as zero-shot or zero-data learning. Such a setting occurs in numerous applications, including drug-target interaction prediction, collaborative filtering, and information retrieval. Efficient training algorithms based on the so-called vec trick that makes use of the special structure of the Kronecker product are known for the case where the training data are a complete bipartite graph. In this paper, we generalize these results to noncomplete training graphs. This allows us to derive a general framework for training Kronecker product kernel methods, as specific examples we implement Kronecker ridge regression and support vector machine algorithms. Experimental results demonstrate that the proposed approach leads to accurate models, while allowing order of magnitude improvements in training and prediction time.
引用
收藏
页码:3374 / 3387
页数:14
相关论文
共 67 条
[1]  
Airola A., 2010, P WORKSH PREF LEARN, P1
[2]   Training linear ranking SVMs in linearithmic time using red-black trees [J].
Airola, Antti ;
Pahikkala, Tapio ;
Salakoski, Tapio .
PATTERN RECOGNITION LETTERS, 2011, 32 (09) :1328-1336
[3]   Kernels for Vector-Valued Functions: A Review [J].
Alvarez, Mauricio A. ;
Rosasco, Lorenzo ;
Lawrence, Neil D. .
FOUNDATIONS AND TRENDS IN MACHINE LEARNING, 2012, 4 (03) :195-266
[4]  
[Anonymous], 2009, P 2009 SIAM INT C DA
[5]  
[Anonymous], 2004, P 21 INT C MACHINE L
[6]   Kernel methods for predicting protein-protein interactions [J].
Ben-Hur, A ;
Noble, WS .
BIOINFORMATICS, 2005, 21 :I38-I46
[7]  
Bonilla E. V., 2007, Artificial Intelligence and Statistics, P43
[8]  
Bottou C.-J., 2007, Large Scale Kernel Machines, V3, P301
[9]   Large-Scale Machine Learning with Stochastic Gradient Descent [J].
Bottou, Leon .
COMPSTAT'2010: 19TH INTERNATIONAL CONFERENCE ON COMPUTATIONAL STATISTICS, 2010, :177-186
[10]   A LIMITED MEMORY ALGORITHM FOR BOUND CONSTRAINED OPTIMIZATION [J].
BYRD, RH ;
LU, PH ;
NOCEDAL, J ;
ZHU, CY .
SIAM JOURNAL ON SCIENTIFIC COMPUTING, 1995, 16 (05) :1190-1208