Learning Optimized Structure of Neural Networks by Hidden Node Pruning With L1 Regularization

Cited by: 38
Authors
Xie, Xuetao [1 ]
Zhang, Huaqing [2 ]
Wang, Junze [3 ]
Chang, Qin [1 ]
Wang, Jian [1 ,3 ]
Pal, Nikhil R. [4 ]
Affiliations
[1] China Univ Petr East China, Coll Sci, Qingdao 266580, Peoples R China
[2] China Univ Petr East China, Coll Control Sci & Engn, Qingdao 266580, Peoples R China
[3] China Univ Petr East China, Coll Comp Sci & Technol, Qingdao 266580, Peoples R China
[4] Indian Stat Inst, Elect & Commun Sci Unit, Kolkata 700108, W Bengal, India
Funding
National Natural Science Foundation of China
Keywords
Convergence; neural network; pruning; regularization; smoothing approximation; group lasso; multilayer perceptron; convergence analysis; weight noise; algorithm; classification; regression; selection; machine
DOI
10.1109/TCYB.2019.2950105
Chinese Library Classification (CLC)
TP [Automation Technology, Computer Technology]
Discipline Classification Code
0812
Abstract
We propose three different methods to determine the optimal number of hidden nodes of a multilayer perceptron network based on L1 regularization. The first two methods attach, respectively, a set of multiplier functions and a set of multipliers to the hidden-layer nodes and apply the L1 regularization to them, while the third method, equipped with the same multipliers, uses a smoothing approximation of the L1 regularization. Each method begins with a given number of hidden nodes; the network is then trained, and redundant hidden nodes are discarded via the multiplier functions or multipliers to obtain an optimal architecture. A simple and generic method, namely, the matrix-based convergence proving method (MCPM), is introduced to prove the weak and strong convergence of the presented smoothing algorithms. The performance of the three pruning methods has been tested on 11 different classification datasets. The results demonstrate the efficient pruning ability and competitive generalization of the proposed methods. The theoretical results are also validated by the experimental results.
Pages: 1333-1346
Page count: 14
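
The abstract describes the core mechanism: each hidden node is gated by a multiplier, an L1 (or smoothed-L1) penalty drives redundant multipliers toward zero during training, and near-zero nodes are then pruned. Below is a minimal NumPy sketch of that idea, not the authors' implementation: the sqrt(c^2 + eps) smoother, the hyperparameters, and the pruning threshold are illustrative assumptions rather than details taken from the paper.

```python
# Sketch: hidden-node pruning via smoothed-L1 regularization on per-node
# multipliers, for a one-hidden-layer MLP (illustrative, not the paper's code).
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def smooth_l1(c, eps=1e-3):
    # Smoothing approximation of |c|: sqrt(c^2 + eps) is differentiable at 0.
    return np.sqrt(c * c + eps)

def smooth_l1_grad(c, eps=1e-3):
    return c / np.sqrt(c * c + eps)

# Toy binary-classification data (stand-in for the paper's benchmark sets).
X = rng.normal(size=(200, 5))
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(float).reshape(-1, 1)

n_in, n_hidden, n_out = 5, 20, 1
W1 = rng.normal(scale=0.5, size=(n_in, n_hidden))
b1 = np.zeros(n_hidden)
W2 = rng.normal(scale=0.5, size=(n_hidden, n_out))
b2 = np.zeros(n_out)
c = np.ones(n_hidden)            # one multiplier per hidden node
lam, lr = 1e-3, 0.1              # penalty weight and learning rate (assumed)

for epoch in range(2000):
    # Forward pass: each hidden output is scaled by its multiplier c_j.
    H = sigmoid(X @ W1 + b1)     # (N, n_hidden)
    Hc = H * c                   # multiplier-gated hidden outputs
    out = sigmoid(Hc @ W2 + b2)  # (N, 1)

    # Objective: mean squared error + smoothed-L1 penalty on the multipliers.
    err = out - y
    loss = 0.5 * np.mean(err ** 2) + lam * smooth_l1(c).sum()

    # Backward pass (plain batch gradient descent).
    d_out = err * out * (1 - out) / len(X)   # (N, 1)
    dW2 = Hc.T @ d_out
    db2 = d_out.sum(axis=0)
    d_Hc = d_out @ W2.T                      # (N, n_hidden)
    d_c = (d_Hc * H).sum(axis=0) + lam * smooth_l1_grad(c)
    d_H = d_Hc * c
    d_pre = d_H * H * (1 - H)
    dW1 = X.T @ d_pre
    db1 = d_pre.sum(axis=0)

    W1 -= lr * dW1; b1 -= lr * db1
    W2 -= lr * dW2; b2 -= lr * db2
    c -= lr * d_c

# Prune hidden nodes whose multipliers were driven near zero (threshold assumed).
keep = np.abs(c) > 1e-2
print(f"kept {keep.sum()} of {n_hidden} hidden nodes")
```

After training, the columns of W1 and rows of W2 belonging to pruned nodes can be deleted outright, yielding the smaller architecture the paper optimizes for; the paper's first method presumably plays the same gating role with multiplier functions in place of the scalar multipliers used here.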