Performance Evaluation of Data-Intensive Computing Applications on a Public IaaS Cloud

Cited by: 3
Authors
Exposito, Roberto R. [1]
Taboada, Guillermo L. [1]
Ramos, Sabela [1]
Tourino, Juan [1]
Doallo, Ramon [1]
Affiliations
[1] Univ A Coruna, Dept Elect & Syst, Comp Architecture Grp, Campus Elvina S-N, La Coruna 15071, Spain
Keywords
data-intensive computing; cloud computing; Infrastructure as a Service; Amazon EC2; cluster file system; MapReduce;
DOI
10.1093/comjnl/bxu111
Chinese Library Classification number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
The advent of cloud computing technologies, which dynamically provide on-demand access to computational resources over the Internet, is offering new possibilities to many scientists and researchers. Nowadays, Infrastructure as a Service (IaaS) cloud providers can offset the increasing processing requirements of data-intensive computing applications, becoming an emerging alternative to traditional servers and clusters. In this paper, a comprehensive study of the leading public IaaS cloud platform, Amazon EC2, has been conducted in order to assess its suitability for data-intensive computing. One of the key contributions of this work is the analysis of the storage-optimized family of EC2 instances. Furthermore, this study presents a detailed analysis of both performance and cost metrics. More specifically, multiple experiments have been carried out to analyze the full I/O software stack, ranging from the low-level storage devices and cluster file systems up to real-world applications using representative data-intensive parallel codes and MapReduce-based workloads. The analysis of the experimental results has shown that data-intensive applications can benefit from tailored EC2-based virtual clusters, enabling users to obtain the highest performance and cost-effectiveness in the cloud.
Pages: 287-307
Page count: 21