IMapC: Inner MAPping Combiner to Enhance the Performance of MapReduce in Hadoop

Cited by: 4
Authors
Kavitha, C. [1]
Srividhya, S. R. [1]
Lai, Wen-Cheng [2, 3]
Mani, Vinodhini [1]
Affiliations
[1] Sathyabama Inst Sci & Technol, Dept Comp Sci & Engn, Chennai 600119, Tamil Nadu, India
[2] Natl Yunlin Univ Sci & Technol, Bachelor Program Ind Projects, Touliu 640301, Yunlin, Taiwan
[3] Natl Yunlin Univ Sci & Technol, Dept Elect Engn, Touliu 640301, Yunlin, Taiwan
Keywords
big data; combiner; distributed storage; Hadoop; MapReduce; sort; task failure resilience; wordcount
DOI
10.3390/electronics11101599
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Hadoop is a framework for storing and processing huge amounts of data. With HDFS, large data sets can be managed on commodity hardware. MapReduce is a programming model for processing vast amounts of data in parallel, organized into a map phase and a reduce phase. A very large amount of data is transferred from the Mapper to the Reducer without any filtering or recursion, which consumes excessive bandwidth. In this paper, we introduce an algorithm called Inner MAPping Combiner (IMapC) for the map phase. Running inside the Mapper, the algorithm combines the values of recurring keys. To test the efficiency of the algorithm, different approaches were compared. According to the tests, MapReduce programs implemented with the Default Combiner (DC) of IMapC are 70% more efficient than those implemented without one. This work can be combined with MapReduce to make computations significantly faster.
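The idea of combining the values of recurring keys inside the Mapper can be illustrated with a small word-count sketch. This is not the authors' IMapC implementation and does not use the Hadoop API; it is a plain-Python simulation, and the function names (`plain_map`, `in_mapper_combining_map`, `reduce_counts`) and the sample input are hypothetical, chosen only for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, Tuple


def plain_map(lines: Iterable[str]) -> Iterator[Tuple[str, int]]:
    """Baseline mapper: emit one (word, 1) pair per word occurrence."""
    for line in lines:
        for word in line.split():
            yield word.lower(), 1


def in_mapper_combining_map(lines: Iterable[str]) -> Iterator[Tuple[str, int]]:
    """Mapper that sums the values of recurring keys before emitting.

    Assumption: the partial counts for one input split fit in memory.
    """
    partial: Dict[str, int] = defaultdict(int)
    for line in lines:
        for word in line.split():
            partial[word.lower()] += 1
    # Emit a single (word, partial_count) pair per distinct key.
    yield from partial.items()


def reduce_counts(pairs: Iterable[Tuple[str, int]]) -> Dict[str, int]:
    """Reducer: sum all values received for each key."""
    totals: Dict[str, int] = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return dict(totals)


# Hypothetical input split, used only to compare the two mappers.
split = ["big data big cluster", "data data big"]
baseline = list(plain_map(split))
combined = list(in_mapper_combining_map(split))

print(len(baseline), "pairs emitted without combining")
print(len(combined), "pairs emitted with in-mapper combining")
print(reduce_counts(combined))
```

Both mappers lead to the same reduced totals; the combining variant simply hands fewer intermediate pairs to the reducer, which is the kind of Mapper-to-Reducer traffic the abstract is concerned with.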
Pages: 16