Performance of Scalable Off-The-Shelf Hardware for Data-intensive Parallel Processing using MapReduce

Cited by: 0
Authors
Fadzil, Ahmad Firdaus Ahmad [1]
Khalid, Noor Elaiza Abdul [1]
Manaf, Mazani [1]
Affiliation
[1] Univ Teknol MARA UiTM, Fac Comp & Math Sci, Shah Alam, Malaysia
Source
2012 7TH INTERNATIONAL CONFERENCE ON COMPUTING AND CONVERGENCE TECHNOLOGY (ICCCT2012) | 2012
Keywords
MapReduce; Parallel processing; Off-the-shelf hardware; Scalability
DOI
Not available
Chinese Library Classification number
TP301 [Theory, methods]
Discipline classification code
081202
Abstract
Processing large volumes of data and information requires high processing power, which usually involves costly supercomputers. The MapReduce parallel framework introduces an automated way of distributing these large processes across many computers. This paper presents a preliminary study of scalability using MapReduce as an automated parallel processing framework running on low-cost off-the-shelf hardware. The system architecture is built from a collection of off-the-shelf hardware units. The scalability test is conducted by adding one off-the-shelf hardware unit at a time to the architecture. MapReduce is used as the parallel framework to automatically distribute tasks according to the available resources. Performance is evaluated based on the improvement in speedup. The results show that MapReduce accommodates the scaling of off-the-shelf hardware resources by automatically distributing tasks regardless of how many hardware units are added to the architecture.
Pages: 379-384
Number of pages: 6