Parallel Data Processing in Dynamic Hybrid Computing Environment Using MapReduce

Cited by: 0
Authors
Tang, Bing [1]
He, Haiwu [2]
Fedak, Gilles [2]
Affiliations
[1] Hunan Univ Sci & Technol, Sch Comp Sci & Engn, Xiangtan 411201, Peoples R China
[2] Univ Lyon, INRIA UCB Lyon 5668, UMR CNRS ENS Lyon, LIP Lab, Lyon, France
Source
ALGORITHMS AND ARCHITECTURES FOR PARALLEL PROCESSING, ICA3PP 2014, PT II | 2014 / Vol. 8631
Keywords
Hybrid Computing Environment; Distributed File System; MapReduce; Volunteer Computing; Fault-tolerance;
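DOI
Not available
Chinese Library Classification number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract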
This paper proposes HybridMR, a novel MapReduce computation model for hybrid computing environments. With this model, high-performance cluster nodes and heterogeneous desktop PCs on the Internet or an intranet can be integrated into a single hybrid computing environment, so that the computation and storage capacity of large numbers of desktop PCs can be fully utilized to process large-scale datasets. HybridMR relies on a hybrid distributed file system called HybridDFS, which uses a time-out method to cope with the volatility of desktop PCs and a file replication mechanism to provide reliable storage. A new node priority-based fair scheduling (NPBFS) algorithm has been developed in HybridMR to balance both data storage and job assignment; it assigns each node a priority by quantifying its CPU speed, memory size and I/O bandwidth. Performance evaluation results show that the proposed hybrid computation model achieves reliable MapReduce computation, reduces task response time and improves MapReduce performance, while also reducing computation cost and achieving a greener computing mode.
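The abstract states only that NPBFS derives a node priority from CPU speed, memory size and I/O bandwidth; it does not give the formula. The following is a minimal sketch of one way such a priority could be computed and used to share out tasks, not the authors' algorithm. The `Node` fields, the equal weights, the max-normalization and the largest-remainder split are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    """Hypothetical node description; field names are illustrative only."""
    name: str
    cpu_ghz: float
    mem_gb: float
    io_mbps: float


def priorities(nodes, weights=(1 / 3, 1 / 3, 1 / 3)):
    """Return {node name: priority}.

    Assumption: each metric is divided by the best value seen across
    all nodes, then combined as a weighted sum.
    """
    w_cpu, w_mem, w_io = weights
    max_cpu = max(n.cpu_ghz for n in nodes)
    max_mem = max(n.mem_gb for n in nodes)
    max_io = max(n.io_mbps for n in nodes)
    return {
        n.name: w_cpu * n.cpu_ghz / max_cpu
        + w_mem * n.mem_gb / max_mem
        + w_io * n.io_mbps / max_io
        for n in nodes
    }


def assign_tasks(nodes, num_tasks, weights=(1 / 3, 1 / 3, 1 / 3)):
    """Split num_tasks across nodes in proportion to priority.

    Uses the largest-remainder method so the shares are whole numbers
    that add up to exactly num_tasks.
    """
    prio = priorities(nodes, weights)
    total = sum(prio.values())
    quotas = {name: num_tasks * p / total for name, p in prio.items()}
    shares = {name: int(q) for name, q in quotas.items()}
    leftover = num_tasks - sum(shares.values())
    # Hand the remaining tasks to the nodes with the largest fractional part.
    by_remainder = sorted(quotas, key=lambda k: quotas[k] - shares[k], reverse=True)
    for name in by_remainder[:leftover]:
        shares[name] += 1
    return shares


if __name__ == "__main__":
    pool = [
        Node("cluster-1", cpu_ghz=4.0, mem_gb=64, io_mbps=1000),
        Node("desktop-1", cpu_ghz=2.0, mem_gb=16, io_mbps=100),
    ]
    print(assign_tasks(pool, 10))
```

Under these assumptions a faster, better-provisioned cluster node receives a larger share of tasks than a desktop PC, which is the balancing effect the abstract attributes to priority-based scheduling; the same priority could equally be used to weight data-block placement.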
Pages: 1-14
Number of pages: 14