ELM*: distributed extreme learning machine with MapReduce

Cited: 43
Authors
Xin, Junchang [1 ,2 ]
Wang, Zhiqiong [3 ]
Chen, Chen [1 ,2 ]
Ding, Linlin [1 ,2 ]
Wang, Guoren [1 ,2 ]
Zhao, Yuhai [1 ,2 ]
Affiliations
[1] Minist Educ, Key Lab Med Image Comp NEU, Shenyang, Peoples R China
[2] Northeastern Univ, Coll Informat Sci & Engn, Shenyang, Liaoning, Peoples R China
[3] Northeastern Univ, Sino-Dutch Biomed & Informat Engn Sch, Shenyang, Liaoning, Peoples R China
Source
WORLD WIDE WEB-INTERNET AND WEB INFORMATION SYSTEMS | 2014, Vol. 17, No. 5
Funding
National Natural Science Foundation of China;
Keywords
Extreme learning machine; Massive data processing; Cloud computing; MapReduce; APPROXIMATION; REGRESSION; FRAMEWORK; NETWORKS;
DOI
10.1007/s11280-013-0236-2
Chinese Library Classification
TP [Automation Technology, Computer Technology];
Discipline Code
0812;
Abstract
Extreme Learning Machine (ELM) has been widely used in many fields, such as text classification, image recognition and bioinformatics, because it provides good generalization performance at an extremely fast learning speed. However, as the data volume in real-world applications grows ever larger, the traditional centralized ELM cannot learn from such massive data efficiently. Therefore, in this paper we propose a novel distributed extreme learning machine based on the MapReduce framework, named ELM*, which overcomes the traditional ELM's weak ability to learn from huge datasets. Firstly, by adequately analyzing the properties of traditional ELM, we find that the most expensive part of computing the Moore-Penrose generalized inverse in the output weight vector calculation is matrix multiplication. Then, since matrix multiplication is decomposable, we develop a distributed extreme learning machine (ELM*) based on the MapReduce framework, which first computes the matrix multiplications efficiently in parallel with MapReduce, and then computes the corresponding output weight vector with centralized computing. In this way, massive data can be learned efficiently. Finally, we conduct extensive experiments on synthetic data under various experimental settings to verify the effectiveness and efficiency of the proposed ELM* in learning massive data.
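The abstract's central observation can be made concrete: the ELM output weights are beta = (H^T H)^{-1} H^T T, where H is the hidden-layer output matrix and T the target matrix, and both H^T H and H^T T are sums over training rows, so they decompose into per-partition partial products that parallelize naturally. Below is a minimal NumPy sketch of that decomposition, simulating the map and reduce phases with array partitions. It illustrates the idea only (the paper's ELM* runs on Hadoop MapReduce), and the function names here are invented for the example.

```python
import numpy as np

# Minimal sketch of the ELM* decomposition described in the abstract
# (illustrative only; the paper's implementation runs on Hadoop MapReduce).
# H is the hidden-layer output matrix, T the target matrix. Because
# H^T H and H^T T sum over training rows, they decompose into
# per-partition partial products:
#   H^T H = sum_p H_p^T H_p    and    H^T T = sum_p H_p^T T_p.

def map_partial(H_p, T_p):
    # "Map" phase: each partition of rows emits its local partial products.
    return H_p.T @ H_p, H_p.T @ T_p

def reduce_sum(partials):
    # "Reduce" phase: element-wise sums of the partial products.
    HtH = sum(p[0] for p in partials)
    HtT = sum(p[1] for p in partials)
    return HtH, HtT

def solve_output_weights(HtH, HtT):
    # Centralized step: beta = (H^T H)^{-1} H^T T, solved as a linear system.
    return np.linalg.lstsq(HtH, HtT, rcond=None)[0]

# Toy usage: random hidden-layer outputs split across 4 simulated mappers.
rng = np.random.default_rng(0)
H = rng.standard_normal((1000, 20))   # 1000 training rows, 20 hidden nodes
T = rng.standard_normal((1000, 1))    # one regression target per row
partials = [map_partial(H_p, T_p)
            for H_p, T_p in zip(np.array_split(H, 4), np.array_split(T, 4))]
beta = solve_output_weights(*reduce_sum(partials))

# Sanity check: the distributed result matches the centralized pseudoinverse.
assert np.allclose(beta, np.linalg.pinv(H) @ T)
```

Note that for L hidden nodes, the reduced H^T H is only L x L regardless of how many training rows exist, which is why the final solve can remain centralized without becoming a bottleneck.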
Pages: 1189-1204
Number of pages: 16