Learning Optimized Structure of Neural Networks by Hidden Node Pruning With L1 Regularization

Cited by: 38
Authors
Xie, Xuetao [1 ]
Zhang, Huaqing [2 ]
Wang, Junze [3 ]
Chang, Qin [1 ]
Wang, Jian [1 ,3 ]
Pal, Nikhil R. [4 ]
Affiliations
[1] China Univ Petr East China, Coll Sci, Qingdao 266580, Peoples R China
[2] China Univ Petr East China, Coll Control Sci & Engn, Qingdao 266580, Peoples R China
[3] China Univ Petr East China, Coll Comp Sci & Technol, Qingdao 266580, Peoples R China
[4] Indian Stat Inst, Elect & Commun Sci Unit, Kolkata 700108, W Bengal, India
Funding
National Natural Science Foundation of China
Keywords
Convergence; neural network; pruning; regularization; smoothing approximation; group lasso; multilayer perceptron; convergence analysis; weight noise; algorithm; classification; regression; selection; machine
DOI
10.1109/TCYB.2019.2950105
Chinese Library Classification (CLC)
TP [Automation Technology, Computer Technology]
Discipline Classification Code
0812
Abstract
We propose three different methods to determine the optimal number of hidden nodes of a multilayer perceptron network based on L1 regularization. The first two methods attach, respectively, a set of multiplier functions and a set of multipliers to the hidden-layer nodes and apply the L1 regularization to them, while the third method, equipped with the same multipliers, uses a smoothing approximation of the L1 regularization. Each method begins with a given number of hidden nodes; the network is then trained, and redundant hidden nodes are discarded via the multiplier functions or multipliers to obtain an optimal architecture. A simple and generic method, namely, the matrix-based convergence proving method (MCPM), is introduced to prove the weak and strong convergence of the presented smoothing algorithms. The performance of the three pruning methods has been tested on 11 different classification datasets. The results demonstrate the efficient pruning ability and competitive generalization of the proposed methods. The theoretical results are also validated by the experimental results.
Pages: 1333-1346
Page count: 14
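
The abstract describes the core mechanism: each hidden node is gated by a multiplier, an L1 (or smoothed-L1) penalty drives redundant multipliers toward zero during training, and near-zero nodes are then pruned. Below is a minimal NumPy sketch of that idea, not the authors' implementation: the sqrt(c^2 + eps) smoother, the hyperparameters, and the pruning threshold are illustrative assumptions rather than details taken from the paper.

```python
# Sketch: hidden-node pruning via smoothed-L1 regularization on per-node
# multipliers, for a one-hidden-layer MLP (illustrative, not the paper's code).
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def smooth_l1(c, eps=1e-3):
    # Smoothing approximation of |c|: sqrt(c^2 + eps) is differentiable at 0.
    return np.sqrt(c * c + eps)

def smooth_l1_grad(c, eps=1e-3):
    return c / np.sqrt(c * c + eps)

# Toy binary-classification data (stand-in for the paper's benchmark sets).
X = rng.normal(size=(200, 5))
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(float).reshape(-1, 1)

n_in, n_hidden, n_out = 5, 20, 1
W1 = rng.normal(scale=0.5, size=(n_in, n_hidden))
b1 = np.zeros(n_hidden)
W2 = rng.normal(scale=0.5, size=(n_hidden, n_out))
b2 = np.zeros(n_out)
c = np.ones(n_hidden)            # one multiplier per hidden node
lam, lr = 1e-3, 0.1              # penalty weight and learning rate (assumed)

for epoch in range(2000):
    # Forward pass: each hidden output is scaled by its multiplier c_j.
    H = sigmoid(X @ W1 + b1)     # (N, n_hidden)
    Hc = H * c                   # multiplier-gated hidden outputs
    out = sigmoid(Hc @ W2 + b2)  # (N, 1)

    # Objective: mean squared error + smoothed-L1 penalty on the multipliers.
    err = out - y
    loss = 0.5 * np.mean(err ** 2) + lam * smooth_l1(c).sum()

    # Backward pass (plain batch gradient descent).
    d_out = err * out * (1 - out) / len(X)   # (N, 1)
    dW2 = Hc.T @ d_out
    db2 = d_out.sum(axis=0)
    d_Hc = d_out @ W2.T                      # (N, n_hidden)
    d_c = (d_Hc * H).sum(axis=0) + lam * smooth_l1_grad(c)
    d_H = d_Hc * c
    d_pre = d_H * H * (1 - H)
    dW1 = X.T @ d_pre
    db1 = d_pre.sum(axis=0)

    W1 -= lr * dW1; b1 -= lr * db1
    W2 -= lr * dW2; b2 -= lr * db2
    c -= lr * d_c

# Prune hidden nodes whose multipliers were driven near zero (threshold assumed).
keep = np.abs(c) > 1e-2
print(f"kept {keep.sum()} of {n_hidden} hidden nodes")
```

After training, the columns of W1 and rows of W2 belonging to pruned nodes can be deleted outright, yielding the smaller architecture the paper optimizes for; the paper's first method presumably plays the same gating role with multiplier functions in place of the scalar multipliers used here.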