A pruning extreme learning machine with $L_{2,1/2}$ regularization for multi-dimensional output problems

Citations: 0
Authors
Yunwei Dai
Yuao Zhang
Qingbiao Wu
Affiliations
[1] Zhejiang University, School of Mathematical Sciences
[2] Hangzhou City University, School of Information and Electrical Engineering
Keywords
Extreme learning machine; $L_{2,1/2}$ regularization; Sparsity; Multi-dimensional output; Alternating direction method of multipliers; Distributed algorithm
DOI
10.1007/s13042-023-01929-z
Abstract
As a fast algorithm for training single-hidden-layer feedforward neural networks, the extreme learning machine (ELM) has been successfully applied to a variety of classification and regression problems. In recent years, regularization techniques have been widely used in ELM to improve its stability, sparsity, and generalization capability. To determine an appropriate number of hidden-layer nodes, the ELM regularized by the $l_{1/2}$ quasi-norm ($L_{1/2}$-ELM) was developed to prune redundant hidden nodes. In multi-dimensional output tasks, however, $L_{1/2}$-ELM removes only redundant weights of hidden nodes and cannot guarantee sparsity at the node level. In this paper, we present $L_{2,1/2}$-ELM, which is regularized by the $L_{2,1/2}$ quasi-norm to achieve sparsity in multi-dimensional output problems. By generalizing $L_{1/2}$ regularization to $L_{2,1/2}$ regularization, $L_{2,1/2}$-ELM can prune hidden nodes by setting the corresponding rows of the output weight matrix to zero.
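To make the node-level sparsity concrete, the following is a minimal NumPy sketch of the $L_{2,1/2}$-regularized objective and of node pruning; it is not the authors' implementation, and the sigmoid activation, variable names, and pruning tolerance are assumptions.

```python
import numpy as np

def hidden_output(X, W, b):
    # ELM random-feature map H = g(XW + b); sigmoid activation assumed.
    return 1.0 / (1.0 + np.exp(-(X @ W + b)))

def l2_half_penalty(beta):
    # sum_i ||beta_i||_2^{1/2}, where row beta_i holds all output
    # weights of hidden node i (the L_{2,1/2} quasi-norm penalty).
    return np.sum(np.sqrt(np.linalg.norm(beta, axis=1)))

def objective(H, beta, T, lam):
    # ||H beta - T||_F^2 + lam * sum_i ||beta_i||_2^{1/2}
    return np.linalg.norm(H @ beta - T) ** 2 + lam * l2_half_penalty(beta)

def prune_nodes(W, b, beta, tol=1e-8):
    # A hidden node is redundant exactly when its whole row of beta is
    # zero, so pruning keeps only nodes with a nonzero output-weight row.
    keep = np.linalg.norm(beta, axis=1) > tol
    return W[:, keep], b[keep], beta[keep]
```

Penalizing the 2-norm of each row, rather than each entry as $L_{1/2}$ regularization does, is what drives entire rows to zero, and is why the regularizer removes whole nodes instead of scattered individual weights.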
Since the proximal operator of $L_{2,1/2}$ regularization has a closed-form solution, the alternating direction method of multipliers (ADMM) is employed to solve $L_{2,1/2}$-ELM quickly. Furthermore, to meet the challenge of distributed computing, we extend $L_{2,1/2}$-ELM to a distributed version, D$L_{2,1/2}$-ELM, which is solved by the consensus ADMM algorithm. Experiments on multi-classification and multi-target regression datasets demonstrate that the proposed algorithms achieve competitive sparsity without compromising accuracy.
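As an illustration of the solver (a sketch under standard ADMM conventions, not the paper's exact pseudocode; the penalty parameter rho, the fixed iteration count, and the cached matrix inverse are assumptions), the quadratic subproblem is a ridge-like linear solve, and the proximal step applies the closed-form half-thresholding function of $L_{1/2}$ regularization (Xu et al.) to each row's 2-norm:

```python
import numpy as np

def half_threshold(t, lam):
    # Closed-form minimizer of (x - t)^2 + lam * |x|^(1/2)
    # (the half-thresholding function from L_{1/2} regularization).
    if abs(t) <= (54.0 ** (1.0 / 3.0) / 4.0) * lam ** (2.0 / 3.0):
        return 0.0
    phi = np.arccos((lam / 8.0) * (abs(t) / 3.0) ** (-1.5))
    return (2.0 / 3.0) * t * (1.0 + np.cos(2.0 * np.pi / 3.0 - 2.0 * phi / 3.0))

def prox_rows(V, lam):
    # argmin_Z ||Z - V||_F^2 + lam * sum_i ||Z_i||_2^{1/2}: each row of V
    # shrinks along its own direction, and rows with small norm vanish.
    Z = np.zeros_like(V)
    for i, v in enumerate(V):
        nv = np.linalg.norm(v)
        if nv > 0.0:
            Z[i] = (half_threshold(nv, lam) / nv) * v
    return Z

def admm_l2half_elm(H, T, lam, rho=1.0, iters=200):
    # ADMM for min_B ||HB - T||_F^2 + lam * sum_i ||B_i||_2^{1/2},
    # with the splitting B = Z.
    L = H.shape[1]
    B = np.zeros((L, T.shape[1]))
    Z, U = B.copy(), B.copy()
    G = np.linalg.inv(2.0 * H.T @ H + rho * np.eye(L))  # cached; fine for moderate L
    HtT2 = 2.0 * H.T @ T
    for _ in range(iters):
        B = G @ (HtT2 + rho * (Z - U))          # ridge-like least-squares step
        Z = prox_rows(B + U, 2.0 * lam / rho)   # (rho/2)-scaling folded into lam
        U = U + B - Z                           # dual update
    return Z  # exactly-zero rows mark the hidden nodes to prune
```

In a standard consensus-ADMM formulation, which is presumably what D$L_{2,1/2}$-ELM builds on, each worker would perform the least-squares step on its own data partition and the proximal step would be applied once to the average of the local estimates, so only the aggregation step changes relative to this centralized sketch.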
Pages: 621–636
Page count: 15