Regression and classification using extreme learning machine based on L1-norm and L2-norm

Cited by: 112

Authors:

Luo, Xiong [1]

Chang, Xiaohui [1]

Ban, Xiaojuan [1]

Affiliation:

[1] Univ Sci & Technol Beijing, Sch Comp & Commun Engn, Beijing 100083, Peoples R China

Source:

NEUROCOMPUTING | 2016 / Vol. 174

Funding:

National Natural Science Foundation of China

Keywords:

Extreme learning machine; Ridge regression; Elastic net; Model selection; Bayesian information criterion (BIC); MODEL; SELECTION;

DOI:

10.1016/j.neucom.2015.03.112

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Extreme learning machine (ELM) is a very simple machine learning algorithm that can achieve good generalization performance at extremely fast speed, which gives it practical significance for data analysis in real-world applications. However, it is normally implemented under the empirical risk minimization scheme and may tend to generate a large-scale, over-fitted model. In this paper, an ELM model based on L1-norm and L2-norm regularization is proposed to handle regression and multi-class classification problems in a unified framework. The proposed model, called L1-L2-ELM, combines the grouping effect of the L2 penalty with the tendency of the L1 penalty towards sparse solutions, so it can control the complexity of the network and prevent over-fitting. To solve the mixed-penalty problem, the separate elastic net algorithm and the Bayesian information criterion (BIC) are adopted to find the optimal model for each response variable. We test the L1-L2-ELM algorithm on one artificial case and nine benchmark data sets to evaluate its performance. Simulation results show that the proposed algorithm outperforms the original ELM as well as other advanced ELM algorithms in terms of prediction accuracy, and that it is more robust in both regression and classification applications. (C) 2015 Elsevier B.V. All rights reserved.
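The idea summarized in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not give the objective, the solver details, or the exact BIC formula, so everything below is an assumption made for illustration. In particular, the sketch swaps in plain cyclic coordinate descent for the paper's separate elastic net algorithm, uses a common Gaussian form of BIC (n log(RSS/n) + k log n, with k the number of non-zero output weights), and fixes the L2 weight while searching a small hypothetical grid of L1 weights. All function and parameter names (`fit_l1l2_elm`, `lam1_grid`, `lam2`, `n_hidden`) are hypothetical.

```python
import numpy as np

def hidden_layer(X, W, b):
    """Sigmoid hidden-layer outputs H for fixed random weights W and biases b."""
    return 1.0 / (1.0 + np.exp(-(X @ W + b)))

def soft_threshold(z, t):
    """Soft-thresholding operator used by the L1 part of the penalty."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def elastic_net_cd(H, y, lam1, lam2, n_iter=200):
    """Cyclic coordinate descent (assumed solver, not the paper's) for
    (1/2n)||y - H beta||^2 + lam1*||beta||_1 + (lam2/2)*||beta||_2^2."""
    n, p = H.shape
    beta = np.zeros(p)
    col_sq = (H ** 2).sum(axis=0) / n
    resid = y.astype(float)  # residual y - H @ beta, with beta = 0 initially
    for _ in range(n_iter):
        for j in range(p):
            resid += H[:, j] * beta[j]        # drop coordinate j from the fit
            rho = H[:, j] @ resid / n
            beta[j] = soft_threshold(rho, lam1) / (col_sq[j] + lam2)
            resid -= H[:, j] * beta[j]        # add the updated coordinate back
    return beta

def bic(H, y, beta):
    """Assumed Gaussian BIC: n*log(RSS/n) + k*log(n), k = non-zero weights."""
    n = len(y)
    rss = np.sum((y - H @ beta) ** 2)
    k = np.count_nonzero(beta)
    return n * np.log(rss / n + 1e-12) + k * np.log(n)  # 1e-12 guards log(0)

def fit_l1l2_elm(X, Y, n_hidden=30, lam1_grid=(1e-3, 1e-2, 1e-1),
                 lam2=1e-2, seed=0):
    """Random hidden layer, then one elastic-net fit per response column,
    keeping the candidate with the lowest BIC for that column."""
    rng = np.random.default_rng(seed)
    W = rng.normal(size=(X.shape[1], n_hidden))
    b = rng.normal(size=n_hidden)
    H = hidden_layer(X, W, b)
    B = np.zeros((n_hidden, Y.shape[1]))
    for k in range(Y.shape[1]):
        candidates = [elastic_net_cd(H, Y[:, k], l1, lam2) for l1 in lam1_grid]
        B[:, k] = min(candidates, key=lambda beta: bic(H, Y[:, k], beta))
    return W, b, B

def predict(X, W, b, B):
    """Network output for new inputs X."""
    return hidden_layer(X, W, b) @ B
```

For classification, one plausible use of this sketch is to pass a one-hot label matrix as `Y` and take the arg-max of `predict` over columns, so that each class column gets its own BIC-selected model; whether this matches the paper's multi-class formulation is not stated in the abstract.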


Pages: 179-186

Page count: 8
