Boundedness and convergence analysis of weight elimination for cyclic training of neural networks

Cited by: 8

Authors:

Wang, Jian [1,3]

Ye, Zhenyun [2]

Gao, Weifeng [1]

Zurada, Jacek M. [3,4]

Affiliations:

[1] China Univ Petr, Coll Sci, Qingdao 266580, Peoples R China

[2] China Univ Petr, Coll Comp & Commun Engn, Qingdao 266580, Peoples R China

[3] Univ Louisville, Dept Elect & Comp Engn, Louisville, KY 40292 USA

[4] Univ Social Sci, Inst Informat Technol, PL-90113 Lodz, Poland

Source:

NEURAL NETWORKS | 2016 / Vol. 82

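Funding:

National Natural Science Foundation of China; Specialized Research Fund for the Doctoral Program of Higher Education; China Postdoctoral Science Foundation;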

Keywords:

Neural networks; Weight decay; Weight elimination; Boundedness; Convergence; PENALTY; ONLINE;

DOI:

10.1016/j.neunet.2016.06.005

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Weight elimination offers a simple and efficient improvement to training algorithms for feedforward neural networks. It is a general regularization technique whose behavior is controlled by flexible scaling parameters. In particular, weight elimination includes weight decay regularization as the special case of a large scaling parameter. Many applications of this technique and its improvements have been reported, but little research has addressed its convergence behavior. In this paper, we theoretically analyze weight elimination for the cyclic learning method and determine conditions for the uniform boundedness of the weight sequence and for weak and strong convergence. Based on the assumed network parameters, the optimal choice of the scaling parameter can also be determined. Two illustrative simulations support the theoretical results. (C) 2016 Elsevier Ltd. All rights reserved.
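The relationship between weight elimination and weight decay mentioned in the abstract can be illustrated with a minimal sketch. It assumes the commonly cited form of the weight elimination penalty, the sum over weights of (w/w0)^2 / (1 + (w/w0)^2), where w0 is the scaling parameter; the abstract itself does not state the formula, so the exact form analyzed in the paper may differ. The function names and sample weights are hypothetical.

```python
# Sketch only: assumes the commonly cited weight elimination penalty
#   P(w) = sum_i (w_i / w0)^2 / (1 + (w_i / w0)^2),
# which is not spelled out in the abstract. All names are illustrative.

def weight_elimination_penalty(weights, w0):
    """Weight elimination penalty; each term lies in [0, 1)."""
    return sum((w / w0) ** 2 / (1.0 + (w / w0) ** 2) for w in weights)


def weight_elimination_grad(weights, w0):
    """Gradient of the penalty with respect to each weight.

    d/dw [w^2 / (w0^2 + w^2)] = 2 * w * w0^2 / (w0^2 + w^2)^2
    """
    return [2.0 * w * w0 ** 2 / (w0 ** 2 + w ** 2) ** 2 for w in weights]


def weight_decay_penalty(weights):
    """Plain weight decay penalty: sum of squared weights."""
    return sum(w ** 2 for w in weights)


weights = [0.5, -1.2, 3.0]   # hypothetical weight vector
w0_large = 1000.0            # large scaling parameter

# For w0 much larger than every |w_i|, w0^2 * P(w) approaches sum_i w_i^2,
# so the penalty behaves like weight decay up to a constant factor.
rescaled = w0_large ** 2 * weight_elimination_penalty(weights, w0_large)
decay = weight_decay_penalty(weights)
print(rescaled, decay)
```

With a small scaling parameter the same penalty saturates near 1 for each large weight, which is why it tends to act as a count of significant weights rather than as a quadratic penalty.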


Pages: 49-61

Number of pages: 13

