Kernel online learning with adaptive kernel width

Cited by: 30
Authors
Fan, Haijin [1 ]
Song, Qing [1 ]
Shrestha, Sumit B. [2 ]
Affiliations
[1] Nanyang Technol Univ, Sch Elect & Elect Engn, Singapore 639798, Singapore
[2] ASTAR, Inst High Performance Comp, Singapore 138632, Singapore
Keywords
Online learning; Kernel width; Adaptive learning; Cumulative coherence; Convergence; ALGORITHM;
DOI
10.1016/j.neucom.2015.10.055
CLC classification number
TP18 [Artificial Intelligence Theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This paper presents a unified framework for kernel online learning (KOL) with adaptive kernels. Unlike traditional KOL algorithms, which apply a fixed kernel width throughout training, the proposed method treats the kernel width as an additional free parameter that is adapted automatically. A robust training method is proposed based on an adaptive dead-zone scheme: the kernel weights and the kernel width are updated under a unified framework in which they share the same learning parameters. We present a theoretical convergence analysis of the proposed adaptive training method, which can switch off learning when the training error is too small relative to the external disturbance. Meanwhile, to regularize the number of kernel functions, a sparsity measure, the cumulative coherence, is applied: a dictionary of predefined size is selected by online minimization of its cumulative coherence, without using any parameters that require prior knowledge of the training samples. Simulation results show that the proposed algorithm adapts to the training data effectively under different initial kernel widths, and can outperform fixed-kernel-width algorithms in both test accuracy and convergence speed. (C) 2015 Elsevier B.V. All rights reserved.
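To make the idea in the abstract concrete, the following is a minimal sketch of a kernel least-mean-square learner whose Gaussian kernel width is adapted online by a gradient step on the instantaneous squared error. All names, step sizes, and the fixed-size dictionary policy are illustrative assumptions; the paper's actual method uses a unified adaptive dead-zone scheme and cumulative-coherence dictionary selection, neither of which is reproduced here.

```python
import numpy as np

# Hypothetical sketch, not the authors' exact algorithm: a KLMS-style
# online learner where the kernel width sigma is a free parameter and is
# adapted by gradient descent on 0.5 * e^2 alongside the kernel weights.

def gauss(x, c, sigma):
    """Gaussian kernel between input x and dictionary centre c."""
    return np.exp(-np.sum((x - c) ** 2) / (2.0 * sigma ** 2))

class AdaptiveWidthKLMS:
    def __init__(self, eta_w=0.2, eta_sigma=0.01, sigma0=1.0, max_dict=50):
        self.eta_w = eta_w        # step size for the kernel weights
        self.eta_s = eta_sigma    # step size for the kernel width
        self.sigma = sigma0       # kernel width, adapted online
        self.max_dict = max_dict  # predefined dictionary size
        self.centres = []         # stored inputs (the dictionary)
        self.weights = []         # expansion coefficients

    def predict(self, x):
        return sum(w * gauss(x, c, self.sigma)
                   for w, c in zip(self.weights, self.centres))

    def update(self, x, y):
        x = np.asarray(x, dtype=float)
        e = y - self.predict(x)
        # d f / d sigma for each kernel term is
        # w * k(x, c) * ||x - c||^2 / sigma^3; descend 0.5 * e^2.
        grad = sum(w * gauss(x, c, self.sigma)
                   * np.sum((x - c) ** 2) / self.sigma ** 3
                   for w, c in zip(self.weights, self.centres))
        self.sigma = float(np.clip(self.sigma + self.eta_s * e * grad,
                                   0.1, 10.0))
        if len(self.centres) < self.max_dict:
            # Grow the dictionary as in plain KLMS until the size cap
            # (a crude stand-in for coherence-based selection).
            self.centres.append(x)
            self.weights.append(self.eta_w * e)
        else:
            # Dictionary full: normalised-LMS update of existing weights.
            phi = np.array([gauss(x, c, self.sigma) for c in self.centres])
            step = self.eta_w * e / (phi @ phi + 1e-8)
            self.weights = [w + step * k for w, k in zip(self.weights, phi)]
        return e
```

Run on a data stream, the width drifts toward a value that reduces the instantaneous error while the capped dictionary keeps the expansion sparse; the paper instead selects the dictionary by minimizing its cumulative coherence.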
Pages: 233-242
Number of pages: 10