Gaussian Kernel Width Optimization for Sparse Bayesian Learning

Cited by: 32

Authors:

Mohsenzadeh, Yalda [1]

Sheikhzadeh, Hamid [1]

Affiliations:

[1] Amirkabir Univ Technol, Dept Elect Engn, Tehran 1415854546, Iran

Source:

IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS | 2015 / Vol. 26 / No. 4

Keywords:

Adaptive kernel learning (AKL); expectation maximization (EM); kernel width optimization; regression; relevance vector machine (RVM); sparse Bayesian learning; supervised kernel methods; MACHINE; POSE;

DOI:

10.1109/TNNLS.2014.2321134

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Sparse kernel methods have been widely used in regression and classification applications. The performance and the sparsity of these methods are dependent on the appropriate choice of the corresponding kernel functions and their parameters. Typically, the kernel parameters are selected using a cross-validation approach. In this paper, a learning method that is an extension of the relevance vector machine (RVM) is presented. The proposed method can find the optimal values of the kernel parameters during the training procedure. This algorithm uses an expectation-maximization approach for updating kernel parameters as well as other model parameters; therefore, the speed of convergence and computational complexity of the proposed method are the same as the standard RVM. To control the convergence of this fully parameterized model, the optimization with respect to the kernel parameters is performed using a constraint on these parameters. The proposed method is compared with the typical RVM and other competing methods to analyze the performance. The experimental results on the commonly used synthetic data, as well as benchmark data sets, demonstrate the effectiveness of the proposed method in reducing the performance dependency on the initial choice of the kernel parameters.
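To make the setting in the abstract concrete, here is a minimal sketch of a standard RVM regression step with Gaussian basis functions. It is not the authors' algorithm: it implements only the well-known EM updates for the weight precisions and noise precision, and the kernel widths are fixed inputs rather than optimized. The function names, the 1-D inputs, and the kernel form exp(-(x - c)^2 / width^2) are assumptions made for illustration. The `widths` argument marks where a per-basis width update, such as the constrained one the abstract describes, would act.

```python
import numpy as np

def gaussian_design_matrix(x, centers, widths):
    """Phi[n, m] = exp(-(x_n - c_m)^2 / widths[m]^2) for 1-D inputs.

    `widths` may be a scalar (shared width) or one value per basis function.
    """
    x = np.asarray(x, dtype=float)[:, None]
    c = np.asarray(centers, dtype=float)[None, :]
    w = np.broadcast_to(np.asarray(widths, dtype=float), (c.shape[1],))[None, :]
    return np.exp(-((x - c) ** 2) / w ** 2)

def rvm_posterior(Phi, t, alpha, beta):
    """Gaussian posterior over weights for fixed alpha (weight precisions)
    and beta (noise precision)."""
    Sigma = np.linalg.inv(np.diag(alpha) + beta * Phi.T @ Phi)
    mu = beta * Sigma @ Phi.T @ t
    return mu, Sigma

def rvm_em_step(Phi, t, alpha, beta):
    """One EM iteration of the standard RVM with the kernel widths held fixed."""
    mu, Sigma = rvm_posterior(Phi, t, alpha, beta)          # E-step
    alpha_new = 1.0 / (mu ** 2 + np.diag(Sigma))            # alpha_i = 1 / E[w_i^2]
    resid = t - Phi @ mu
    noise_var = (resid @ resid + np.trace(Phi @ Sigma @ Phi.T)) / len(t)
    return alpha_new, 1.0 / noise_var, mu

# Toy usage on noisy sinc data, one basis function centered at each input.
rng = np.random.default_rng(0)
x = np.linspace(-10.0, 10.0, 30)
t = np.sinc(x / np.pi) + 0.05 * rng.standard_normal(x.size)
Phi = gaussian_design_matrix(x, centers=x, widths=2.0)      # width chosen by hand
alpha, beta = np.ones(x.size), 1.0
for _ in range(10):
    alpha, beta, mu = rvm_em_step(Phi, t, alpha, beta)
```

Rerunning the toy example with a different `widths` value changes the fit and the number of large `alpha` entries (pruned basis functions), which is the dependence on the kernel parameter that the abstract refers to.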


Pages: 709-719

Page count: 11
