Large-Scale Deep Belief Nets With MapReduce

Cited by: 36
Authors
Zhang, Kunlei [1]
Chen, Xue-Wen [1]
Affiliation
[1] Wayne State Univ, Dept Comp Sci, Detroit, MI 48202 USA
Source
IEEE Access | 2014 / Vol. 2
Keywords
Big data; deep learning; MapReduce; Hadoop; deep belief net (DBN); restricted Boltzmann machine (RBM);
DOI
10.1109/ACCESS.2014.2319813
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Deep belief nets (DBNs) with restricted Boltzmann machines (RBMs) as the building block have recently attracted wide attention due to their strong performance in various applications. Learning a DBN starts with pretraining a series of RBMs, followed by fine-tuning the whole net using backpropagation. Generally, a sequential implementation of both the RBMs and the backpropagation algorithm takes a significant amount of computational time to process massive data sets. Emerging big data learning therefore requires distributed computing for DBNs. In this paper, we present a distributed learning paradigm for the RBMs and the backpropagation algorithm using MapReduce, a popular parallel programming model. DBNs can thus be trained in a distributed way by stacking a series of distributed RBMs for pretraining and applying distributed backpropagation for fine-tuning. Validated on benchmark data sets from various practical problems, the experimental results demonstrate that the distributed RBMs and DBNs are amenable to large-scale data, with good performance in terms of accuracy and efficiency.
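The abstract does not spell out how the RBM update is split into map and reduce steps, so the following is only a minimal sketch of one plausible layout, not the authors' Hadoop implementation. It assumes one-step contrastive divergence (CD-1), binary visible and hidden units, and an in-process simulation in which each "mapper" accumulates gradient statistics over its own data shard and a "reducer" sums the partial results before a single parameter update. All function names (`map_cd1`, `reduce_sum`, `train_epoch`) and the toy data are hypothetical.

```python
import math
import random
from functools import reduce


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def hidden_probs(v, W, c):
    # P(h_j = 1 | v) for every hidden unit j
    return [sigmoid(c[j] + sum(v[i] * W[i][j] for i in range(len(v))))
            for j in range(len(c))]


def visible_probs(h, W, b):
    # P(v_i = 1 | h) for every visible unit i
    return [sigmoid(b[i] + sum(h[j] * W[i][j] for j in range(len(h))))
            for i in range(len(b))]


def map_cd1(shard, W, b, c, rng):
    """Mapper (assumed layout): sum CD-1 statistics over one data shard."""
    n_v, n_h = len(b), len(c)
    dW = [[0.0] * n_h for _ in range(n_v)]
    db = [0.0] * n_v
    dc = [0.0] * n_h
    for v0 in shard:
        p_h0 = hidden_probs(v0, W, c)
        h0 = [1.0 if rng.random() < p else 0.0 for p in p_h0]  # sample hidden
        v1 = visible_probs(h0, W, b)                           # reconstruction
        p_h1 = hidden_probs(v1, W, c)
        for i in range(n_v):
            db[i] += v0[i] - v1[i]
            for j in range(n_h):
                dW[i][j] += v0[i] * p_h0[j] - v1[i] * p_h1[j]
        for j in range(n_h):
            dc[j] += p_h0[j] - p_h1[j]
    return dW, db, dc, len(shard)


def reduce_sum(a, b):
    """Reducer: element-wise sum of two partial results."""
    dW = [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a[0], b[0])]
    db = [x + y for x, y in zip(a[1], b[1])]
    dc = [x + y for x, y in zip(a[2], b[2])]
    return dW, db, dc, a[3] + b[3]


def train_epoch(shards, W, b, c, lr, rng):
    """One pass: map over shards, reduce, then update W, b, c in place."""
    partials = [map_cd1(s, W, b, c, rng) for s in shards]  # map phase
    dW, db, dc, n = reduce(reduce_sum, partials)            # reduce phase
    for i in range(len(b)):
        b[i] += lr * db[i] / n
        for j in range(len(c)):
            W[i][j] += lr * dW[i][j] / n
    for j in range(len(c)):
        c[j] += lr * dc[j] / n
    return n


if __name__ == "__main__":
    rng = random.Random(0)
    # Toy binary data, split into two shards to mimic two mappers.
    data = [[1, 1, 0, 0], [1, 0, 0, 0], [0, 0, 1, 1], [0, 0, 0, 1]] * 5
    shards = [data[:10], data[10:]]
    W = [[rng.gauss(0.0, 0.01) for _ in range(2)] for _ in range(4)]
    b, c = [0.0] * 4, [0.0] * 2
    for _ in range(50):
        train_epoch(shards, W, b, c, lr=0.1, rng=rng)
```

Because the CD-1 statistics are plain sums over training examples, the partial results from different shards can be added in any order, which is what makes this kind of update a natural fit for a map and reduce split. Stacking several such RBMs and adding a similarly structured gradient sum for backpropagation would follow the pretraining and fine-tuning outline in the abstract.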
Pages: 395-403
Page count: 9