An Advanced MapReduce: Cloud MapReduce, Enhancements and Applications

Cited by: 30
Authors
Dahiphale, Devendra [1]
Karve, Rutvik [1]
Vasilakos, Athanasios V. [2]
Liu, Huan [3]
Yu, Zhiwei [4]
Chhajer, Amit [1]
Wang, Jianmin [4]
Wang, Chaokun [4]
Affiliations
[1] Pune Inst Comp Technol, Pune, Maharashtra, India
[2] Kuwait Univ, Kuwait, Kuwait
[3] Jamo, Hyderabad, Andhra Pradesh, India
[4] Tsinghua Univ, Sch Software, Beijing 100084, Peoples R China
Source
IEEE TRANSACTIONS ON NETWORK AND SERVICE MANAGEMENT | 2014 / Vol. 11 / No. 1
Keywords
Cloud computing; MapReduce; pipelining; stream processing; spot market
DOI
10.1109/TNSM.2014.031714.130407
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Cloud computing has recently attracted great attention because it provides configurable computing resources. MapReduce (MR) is a popular framework for data-intensive distributed computing of batch jobs. However, MapReduce has the following drawbacks: (1) it processes the Map and Reduce phases sequentially; (2) being cluster-based, its scalability is relatively limited; (3) it does not support flexible pricing; and (4) it does not support stream data processing. We describe Cloud MapReduce (CMR), which overcomes these limitations. Our results show that CMR is more efficient and runs faster than other implementations of the MR framework. In addition, we show how CMR can be further enhanced to: (1) support stream data processing in addition to batch data by parallelizing the Map and Reduce phases through a pipelining model; (2) support flexible pricing using Amazon Cloud's spot instances and deal with massive machine terminations caused by spot price fluctuations; (3) improve throughput and speed up processing over traditional MR by more than 30% for large data sets; and (4) provide added flexibility and scalability by leveraging features of the cloud computing model. Click-stream analysis, real-time multimedia processing, time-sensitive analysis, and other stream processing applications can also be supported.
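The pipelining idea in the abstract, where Reduce starts consuming Map output before the Map phase has finished, can be illustrated with a minimal single-process sketch. This is not CMR's implementation: the in-memory `queue.Queue`, the two threads, the word-count job, and the names `mapper`, `reducer`, and `pipelined_word_count` are all assumptions chosen for illustration.

```python
import queue
import threading
from collections import Counter

# Sentinel marking the end of the map output stream (hypothetical helper).
_DONE = object()


def mapper(lines, out_q):
    """Emit (word, 1) pairs as soon as each input line is processed."""
    for line in lines:
        for word in line.split():
            out_q.put((word.lower(), 1))
    out_q.put(_DONE)


def reducer(in_q, counts):
    """Aggregate pairs as they arrive, without waiting for the mapper to end."""
    while True:
        item = in_q.get()
        if item is _DONE:
            break
        word, n = item
        counts[word] += n


def pipelined_word_count(lines):
    """Run the map and reduce stages concurrently, joined by a queue."""
    q = queue.Queue()
    counts = Counter()
    map_thread = threading.Thread(target=mapper, args=(lines, q))
    reduce_thread = threading.Thread(target=reducer, args=(q, counts))
    reduce_thread.start()
    map_thread.start()
    map_thread.join()
    reduce_thread.join()
    return dict(counts)


if __name__ == "__main__":
    print(pipelined_word_count(["a b a", "B c"]))
```

Because the reducer only needs the next available pair, the same structure would also accept an unbounded input iterator, which is the property that makes a pipelined design suitable for stream data.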
Pages: 101-115
Number of pages: 15