Hadoop Dataset for Job Estimation in the Cloud with Limited Bandwidth

Cited by: 1
Authors
Bergui, Mohammed [1]
Nikolov, Nikola S. [2]
Najah, Said [1]
Affiliations
[1] Univ Sidi Mohammed Ben Abdellah, Fac Sci & Technol, Dept Comp Sci, Lab Intelligent Syst & Applicat, Fes, Morocco
[2] Univ Limerick, Dept Comp Sci & Informat Syst, Limerick, Ireland
Source
ADVANCES IN INFORMATION AND COMMUNICATION, FICC, VOL 2 | 2023 / Vol. 652
Keywords
Hadoop; MapReduce; Cloud computing; Bandwidth; Estimating the runtime
DOI
10.1007/978-3-031-28073-3_24
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Hadoop MapReduce is a well-known open-source framework for processing large amounts of data on a cluster of machines; it has been adopted by many organizations and deployed both on-premises and in the cloud. Estimating and predicting MapReduce job execution time is crucial for efficient scheduling, resource management, lower energy consumption, and cost savings. In this paper, we present a new dataset of MapReduce job traces collected in a cloud environment with limited network bandwidth, and we describe how the dataset was generated and collected. We believe that this dataset will help researchers develop new scheduling approaches and improve Hadoop MapReduce job performance.
Pages: 341-348
Page count: 8