A pruning extreme learning machine with $L_{2,1/2}$ regularization for multi-dimensional output problems

Cited by: 0
Authors
Yunwei Dai
Yuao Zhang
Qingbiao Wu
Affiliations
[1] Zhejiang University, School of Mathematical Sciences
[2] Hangzhou City University, School of Information and Electrical Engineering
Keywords
Extreme learning machine; $L_{2,1/2}$ regularization; Sparsity; Multi-dimensional output; Alternating direction method of multipliers; Distributed algorithm
DOI: 10.1007/s13042-023-01929-z
Abstract
As a fast algorithm for training single-hidden-layer feedforward neural networks, the extreme learning machine (ELM) has been successfully applied to a wide range of classification and regression problems. In recent years, regularization techniques have been widely used in ELM to improve its stability, sparsity, and generalization capability. To determine an appropriate number of hidden-layer nodes, the ELM regularized by the $l_{1/2}$ quasi-norm ($L_{1/2}$-ELM) was developed to prune redundant hidden nodes. However, in multi-dimensional output tasks, $L_{1/2}$-ELM only removes redundant weights of hidden nodes and cannot guarantee sparsity at the node level. In this paper, we present $L_{2,1/2}$-ELM, which is regularized by the $L_{2,1/2}$ quasi-norm to achieve sparsity in multi-dimensional output problems. By generalizing $L_{1/2}$ regularization to $L_{2,1/2}$ regularization, $L_{2,1/2}$-ELM can prune the corresponding hidden nodes by setting entire rows of the output weight matrix to zero.
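For concreteness, here is a minimal formulation consistent with the description above; the notation and the exact scaling of the penalty are assumptions rather than the paper's own statement. With hidden-layer output matrix $H \in \mathbb{R}^{N \times L}$, target matrix $T \in \mathbb{R}^{N \times m}$, and output weight matrix $\beta \in \mathbb{R}^{L \times m}$ whose $i$-th row $\beta_{i,:}$ carries the outgoing weights of hidden node $i$, the training problem reads

$$\min_{\beta \in \mathbb{R}^{L \times m}} \ \frac{1}{2}\,\|H\beta - T\|_F^2 + \lambda \sum_{i=1}^{L} \|\beta_{i,:}\|_2^{1/2},$$

where the penalty is the $L_{2,1/2}$ quasi-norm term. Because it acts on row norms rather than on individual entries, a sufficiently large $\lambda$ drives whole rows of $\beta$ to zero, and the corresponding hidden nodes can be pruned.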
Since the proximal operator of the $L_{2,1/2}$ regularizer has a closed-form solution, the alternating direction method of multipliers (ADMM) is employed to solve $L_{2,1/2}$-ELM efficiently. Furthermore, to meet the challenge of distributed computing, we extend $L_{2,1/2}$-ELM to a distributed version, DL$_{2,1/2}$-ELM, which is solved by the consensus ADMM algorithm. Experiments on multi-class classification and multi-target regression datasets demonstrate that the proposed algorithms achieve competitive sparsity without compromising accuracy.
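A minimal single-machine sketch of such an ADMM scheme, under the formulation above: the splitting $\beta = Z$, the value of the penalty parameter rho, and the use of Xu et al.'s $L_{1/2}$ half-thresholding operator applied row-wise as the closed-form proximal map are all assumptions, not the authors' implementation.

import numpy as np

def half_threshold_rows(V, lam):
    """Row-wise half-thresholding: closed-form proximal map of
    lam * sum_i ||V[i, :]||_2^(1/2), i.e. the scalar L_{1/2}
    half-thresholding operator applied to each row norm (assumed form)."""
    Z = np.zeros_like(V)
    thresh = (54.0 ** (1.0 / 3.0) / 4.0) * lam ** (2.0 / 3.0)
    for i, v in enumerate(V):
        nv = np.linalg.norm(v)
        if nv > thresh:  # below the threshold the whole row is zeroed out
            phi = np.arccos((lam / 8.0) * (nv / 3.0) ** (-1.5))
            scale = (2.0 / 3.0) * (1.0 + np.cos(2.0 * np.pi / 3.0 - 2.0 * phi / 3.0))
            Z[i] = scale * v
    return Z

def l212_elm_admm(H, T, lam=1e-2, rho=1.0, iters=200):
    """ADMM for min_B 0.5*||H B - T||_F^2 + lam * sum_i ||B[i, :]||_2^(1/2),
    with splitting B = Z and scaled dual variable U."""
    L, m = H.shape[1], T.shape[1]
    A = H.T @ H + rho * np.eye(L)  # system matrix reused by every B-update
    HtT = H.T @ T
    B = np.zeros((L, m)); Z = np.zeros_like(B); U = np.zeros_like(B)
    for _ in range(iters):
        B = np.linalg.solve(A, HtT + rho * (Z - U))  # ridge-type update
        Z = half_threshold_rows(B + U, lam / rho)    # row-sparse update
        U += B - Z                                   # dual ascent
    return Z  # zero rows mark hidden nodes that can be pruned

The distributed DL$_{2,1/2}$-ELM mentioned above would, in consensus-ADMM fashion, replace the single B-update with per-worker least-squares updates on local data shards followed by an averaging/consensus step; that variant is not sketched here.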
Pages: 621-636 (15 pages)