Building feature space of extreme learning machine with sparse denoising stacked-autoencoder

Cited by: 48
Authors
Cao, Le-le [1]
Huang, Wen-bing [1]
Sun, Fu-chun [1]
Affiliations
[1] Tsinghua Univ, Dept Comp Sci & Technol, Tsinghua Natl Lab Informat Sci & Technol TNList, Beijing 100084, Peoples R China
Keywords
Extreme learning machine (ELM); Ridge regression; Feature space; Stacked autoencoder (SAE); Classification; Regression; FACE RECOGNITION; BELIEF NETWORKS; DEEP; REGRESSION;
DOI
10.1016/j.neucom.2015.02.096
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The random-hidden-node extreme learning machine (ELM) is a more general class of single-hidden-layer feed-forward neural networks (SLFNs) that comprises three parts: random projection, nonlinear transformation, and a ridge regression (RR) model. Networks with deep architectures have demonstrated state-of-the-art performance in a variety of settings, especially in computer vision tasks. Deep learning algorithms such as the stacked autoencoder (SAE) and the deep belief network (DBN) are built on learning several levels of representation of the input. Beyond simply learning features by stacking autoencoders (AE), there is a need to increase their robustness to noise and to reinforce the sparsity of weights, which makes it easier to discover interesting and prominent features. The sparse AE and the denoising AE were developed for this purpose. This paper proposes an approach, SSDAE-RR (stacked sparse denoising autoencoder - ridge regression), that effectively integrates the advantages of SAE, sparse AE, denoising AE, and the RR implementation in the ELM algorithm. We conducted an experimental study on real-world classification (binary and multiclass) and regression problems of different scales, comparing several relevant approaches: SSDAE-RR, ELM, DBN, neural network (NN), and SAE. The performance analysis shows that SSDAE-RR tends to achieve better generalization ability on relatively large datasets (large sample size and high dimension) that were not pre-processed for feature abstraction. For 16 out of 18 tested datasets, the performance of SSDAE-RR is more stable than that of the other tested approaches. We also note that the sparsity regularization and denoising mechanism seem to be mandatory for constructing interpretable feature representations. The fact that an SSDAE-RR approach often has a training time comparable to that of ELM makes it useful in some real applications. (C) 2015 Elsevier B.V. All rights reserved.
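The three ELM parts named in the abstract (random projection, nonlinear transformation, ridge regression) can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the function names `elm_fit` and `elm_predict`, the tanh activation, the hidden-layer size, the ridge strength, and the toy sine data are all assumptions made for the example.

```python
import numpy as np


def elm_fit(X, y, n_hidden=50, ridge=1e-2, seed=0):
    """Fit a basic ELM (hypothetical helper; all hyperparameters are assumed)."""
    rng = np.random.default_rng(seed)
    # Part 1: random projection. Weights and biases are drawn once, never trained.
    W = rng.standard_normal((X.shape[1], n_hidden))
    b = rng.standard_normal(n_hidden)
    # Part 2: nonlinear transformation (tanh is an assumed choice of activation).
    H = np.tanh(X @ W + b)
    # Part 3: ridge regression in closed form: beta = (H^T H + ridge * I)^-1 H^T y.
    beta = np.linalg.solve(H.T @ H + ridge * np.eye(n_hidden), H.T @ y)
    return W, b, beta


def elm_predict(X, W, b, beta):
    """Apply the fixed random hidden layer, then the learned output weights."""
    return np.tanh(X @ W + b) @ beta


# Toy regression data, assumed purely for illustration.
X_demo = np.linspace(-3.0, 3.0, 200).reshape(-1, 1)
y_demo = np.sin(X_demo).ravel()
W_demo, b_demo, beta_demo = elm_fit(X_demo, y_demo)
mse_demo = float(np.mean((elm_predict(X_demo, W_demo, b_demo, beta_demo) - y_demo) ** 2))
```

Only the output weights are learned, and they come from a single linear solve, which is why ELM training is typically fast. In the approach the abstract describes, the random hidden layer would be replaced by features learned with a stacked sparse denoising autoencoder, while the ridge regression output stage is retained.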
Pages: 60-71
Page count: 12
Related papers
58 in total
  • [21] Molecular classification of cancer: Class discovery and class prediction by gene expression monitoring
    Golub, TR
    Slonim, DK
    Tamayo, P
    Huard, C
    Gaasenbeek, M
    Mesirov, JP
    Coller, H
    Loh, ML
    Downing, JR
    Caligiuri, MA
    Bloomfield, CD
    Lander, ES
    [J]. SCIENCE, 1999, 286 (5439) : 531 - 537
  • [22] Guyon I., 2004, Advances in Neural Information Processing Systems, V17
  • [23] HEDONIC HOUSING PRICES AND DEMAND FOR CLEAN-AIR
    HARRISON, D
    RUBINFELD, DL
    [J]. JOURNAL OF ENVIRONMENTAL ECONOMICS AND MANAGEMENT, 1978, 5 (01) : 81 - 102
  • [24] Reducing the dimensionality of data with neural networks
    Hinton, G. E.
    Salakhutdinov, R. R.
    [J]. SCIENCE, 2006, 313 (5786) : 504 - 507
  • [25] Learning multiple layers of representation
    Hinton, Geoffrey E.
    [J]. TRENDS IN COGNITIVE SCIENCES, 2007, 11 (10) : 428 - 434
  • [26] A fast learning algorithm for deep belief nets
    Hinton, Geoffrey E.
    Osindero, Simon
    Teh, Yee-Whye
    [J]. NEURAL COMPUTATION, 2006, 18 (07) : 1527 - 1554
  • [27] Hinton Geoffrey E, 2008, ADV NEURAL INFORM PR, V20, P1249
  • [28] RIDGE REGRESSION - BIASED ESTIMATION FOR NONORTHOGONAL PROBLEMS
    HOERL, AE
    KENNARD, RW
    [J]. TECHNOMETRICS, 1970, 12 (01) : 55 - &
  • [29] Sketched symbol recognition using Zernike moments
    Hse, H
    Newton, AR
    [J]. PROCEEDINGS OF THE 17TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION, VOL 1, 2004, : 367 - 370
  • [30] Huang GB, 2004, IEEE IJCNN, P985