Cloud storage reliability for Big Data applications: A state of the art survey

Cited by: 81
Authors
Nachiappan, Rekha [1]
Javadi, Bahman [1]
Calheiros, Rodrigo N. [1]
Matawie, Kenan M. [2]
Affiliations
[1] Western Sydney Univ, Sch Comp Engn & Math, Penrith, NSW, Australia
[2] Western Sydney Univ, Sch Comp Engn & Math, Stat, Penrith, NSW, Australia
Keywords
Fault tolerance; Big Data applications; Cloud storage; Replication; Erasure coding; Data reliability; EXACT-REGENERATING CODES; DISTRIBUTED STORAGE; DATA REPLICATION; ERASURE CODES; REPAIR; CONSTRUCTION; SCHEME; AVAILABILITY; STRATEGY; FAILURE
DOI
10.1016/j.jnca.2017.08.011
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Cloud storage systems are now mature enough to handle massive volumes of heterogeneous and rapidly changing data, known as Big Data. However, failures are inevitable in cloud storage systems because they are composed of large-scale hardware components. Improving fault tolerance in cloud storage systems for Big Data applications is therefore a significant challenge. Replication and erasure coding are the most important data reliability techniques employed in cloud storage systems. Each technique has its own trade-offs across parameters such as durability, availability, storage overhead, network bandwidth and traffic, energy consumption, and recovery performance. This survey explores the challenges of employing both techniques in cloud storage systems for Big Data applications with respect to these parameters. We also introduce a conceptual hybrid technique to further improve the reliability, latency, bandwidth usage, and storage efficiency of Big Data applications in cloud computing.
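The storage-overhead and recovery trade-off named in the abstract can be illustrated with a short calculation. The sketch below is not taken from the paper: the parameters (3 copies, 6 data plus 3 parity blocks) and the names Scheme, replication, erasure_code, and storage_overhead are assumptions chosen for illustration. It assumes a maximum distance separable (MDS) erasure code such as Reed-Solomon with naive repair, where rebuilding one lost block reads k surviving blocks.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scheme:
    """Hypothetical summary of a redundancy scheme (illustrative only)."""
    name: str
    data_blocks: int        # blocks that hold original data
    total_blocks: int       # blocks actually stored, including redundancy
    tolerated_losses: int   # simultaneous block losses that can be survived
    repair_reads: int       # blocks read to rebuild one lost block


def replication(copies: int) -> Scheme:
    # Every copy is a full duplicate, so one surviving copy repairs a loss.
    return Scheme(f"{copies}-way replication", 1, copies, copies - 1, 1)


def erasure_code(k: int, m: int) -> Scheme:
    # Assumes an MDS code: any k of the k + m blocks recover the data,
    # and naive repair of a single block reads k surviving blocks.
    return Scheme(f"({k}+{m}) MDS erasure code", k, k + m, m, k)


def storage_overhead(scheme: Scheme) -> float:
    """Bytes stored per byte of original data."""
    return scheme.total_blocks / scheme.data_blocks


if __name__ == "__main__":
    for scheme in (replication(3), erasure_code(6, 3)):
        print(
            f"{scheme.name}: overhead {storage_overhead(scheme):.2f}x, "
            f"tolerates {scheme.tolerated_losses} losses, "
            f"reads {scheme.repair_reads} block(s) per repair"
        )
```

Under these assumed parameters, the erasure code stores half as much as 3-way replication while tolerating more losses, but it reads six blocks instead of one to repair a single failure. This is the kind of bandwidth and recovery cost the abstract weighs against storage efficiency.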
Pages: 35-47
Page count: 13