Parallel ensemble of online sequential extreme learning machine based on MapReduce

Cited by: 40
Authors
Huang, Shan [1]
Wang, Botao [1]
Qiu, Junhao [1]
Yao, Jitao [1]
Wang, Guoren [1]
Yu, Ge [1]
Affiliation
[1] Northeastern University, College of Information Science and Engineering, Shenyang 110004, Liaoning, People's Republic of China
Funding
National Natural Science Foundation of China
Keywords
Parallel learning; Ensemble; Extreme learning machine; MapReduce; Sequential learning; Regression
DOI
10.1016/j.neucom.2015.04.105
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In this era of big data, analyzing large-scale data efficiently and accurately has become a challenging problem. As one of the ELM variants, the online sequential extreme learning machine (OS-ELM) provides a method to analyze incremental data. Ensemble methods provide a way to learn from data more accurately. MapReduce, which provides a simple, scalable and fault-tolerant framework, can be utilized for large-scale learning. In this paper, we first propose an ensemble OS-ELM framework which supports any combination of bagging, subspace partitioning and cross validation. Then we design a parallel ensemble of online sequential extreme learning machine (PEOS-ELM) algorithm based on MapReduce for large-scale learning. The PEOS-ELM algorithm is evaluated on real and synthetic data with up to 5120K training samples and up to 512 attributes. Its speedup reaches as high as 40 on a cluster with up to 80 cores. The accuracy of PEOS-ELM is at the same level as that of ensemble OS-ELM executed on a single machine, which in turn is higher than that of the original OS-ELM. © 2015 Elsevier B.V. All rights reserved.
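The abstract names its building blocks only at a high level. As a rough illustration of two of them, sequential OS-ELM updates and a bagging-style ensemble, here is a minimal single-machine sketch in plain Python. It is not the paper's PEOS-ELM: the MapReduce parallelization, subspace partitioning and cross validation are left out, the update processes one sample at a time with a regularized start (P0 = I / ridge) rather than an initial batch solve, and the names OSELM, BaggedOSELM, keep_prob and ridge are hypothetical choices made for this example.

```python
import math
import random


class OSELM:
    """Single-output OS-ELM regressor, updated one sample at a time."""

    def __init__(self, n_inputs, n_hidden, rng, ridge=1e-3):
        # Random hidden-layer weights and biases; these are never trained.
        self.w = [[rng.uniform(-1, 1) for _ in range(n_inputs)]
                  for _ in range(n_hidden)]
        self.b = [rng.uniform(-1, 1) for _ in range(n_hidden)]
        # Assumption: regularized start P0 = I / ridge and beta0 = 0,
        # used here instead of solving an initial batch.
        self.P = [[1.0 / ridge if i == j else 0.0 for j in range(n_hidden)]
                  for i in range(n_hidden)]
        self.beta = [0.0] * n_hidden

    def hidden(self, x):
        # Sigmoid activation of every hidden node for input vector x.
        out = []
        for w, b in zip(self.w, self.b):
            z = sum(wi * xi for wi, xi in zip(w, x)) + b
            out.append(1.0 / (1.0 + math.exp(-z)))
        return out

    def partial_fit(self, x, t):
        h = self.hidden(x)
        n = len(h)
        Ph = [sum(p * hj for p, hj in zip(row, h)) for row in self.P]
        denom = 1.0 + sum(hi * v for hi, v in zip(h, Ph))
        # P <- P - (P h h^T P) / (1 + h^T P h); P is symmetric.
        for i in range(n):
            for j in range(n):
                self.P[i][j] -= Ph[i] * Ph[j] / denom
        # beta <- beta + P h (t - h^T beta), using the updated P.
        err = t - sum(hi * bi for hi, bi in zip(h, self.beta))
        Ph_new = [sum(p * hj for p, hj in zip(row, h)) for row in self.P]
        for i in range(n):
            self.beta[i] += Ph_new[i] * err

    def predict(self, x):
        return sum(hi * bi for hi, bi in zip(self.hidden(x), self.beta))


class BaggedOSELM:
    """Averages several OS-ELMs, each trained on a random subset of samples."""

    def __init__(self, n_members, n_inputs, n_hidden, keep_prob=0.8, seed=0):
        self.rng = random.Random(seed)
        self.keep_prob = keep_prob
        self.members = [OSELM(n_inputs, n_hidden, self.rng)
                        for _ in range(n_members)]

    def partial_fit(self, x, t):
        # Each member sees the sample with probability keep_prob.
        for member in self.members:
            if self.rng.random() < self.keep_prob:
                member.partial_fit(x, t)

    def predict(self, x):
        return sum(m.predict(x) for m in self.members) / len(self.members)


# Usage: stream samples of a toy target t = 2*x0 - x1 into the ensemble.
data_rng = random.Random(42)
ensemble = BaggedOSELM(n_members=3, n_inputs=2, n_hidden=10)
for _ in range(300):
    x = [data_rng.random(), data_rng.random()]
    ensemble.partial_fit(x, 2 * x[0] - x[1])
print(ensemble.predict([0.5, 0.5]))
```

Because each member keeps only its own P matrix and output weights, the members can be trained independently of one another, which is the property a parallel ensemble would rely on.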
Pages: 352-367
Page count: 16