A partitioning framework for Cassandra NoSQL database using Rendezvous hashing

Cited by: 7
Authors
Elghamrawy, Sally M. [1, 3]
Hassanien, Aboul Ella [2, 3]
Affiliations
[1] MISR Higher Inst Engn & Technol, Mansoura, Egypt
[2] Cairo Univ, Fac Comp & Informat, Giza, Egypt
[3] SRGE, Giza, Egypt
Keywords
Cassandra; Rendezvous hashing; Consistent hashing; MapReduce; NoSQL databases; Partitioning;
DOI
10.1007/s11227-017-2027-5
Chinese Library Classification number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
With the steady growth of data volumes in social networks and cloud computing, the term "Big data" has emerged, along with the challenge of storing immense datasets. Many tools and algorithms have been developed to address this challenge. NoSQL databases, such as Cassandra and MongoDB, are designed with data management systems that can handle and process huge volumes of data. Partitioning data in NoSQL databases is considered one of the critical challenges in database design. In this paper, a MapReduce Rendezvous Hashing-Based Virtual Hierarchies (MR-RHVH) framework is proposed for scalable partitioning of the Cassandra NoSQL database. The MapReduce framework is used to implement MR-RHVH on Cassandra to enhance its performance in highly distributed environments. MR-RHVH distributes the nodes to rendezvous regions based on a proposed Adopted Virtual Hierarchies strategy. Each region is responsible for a set of nodes. In addition, a proposed Bloom filter evaluator is used to ensure the accurate allocation of keys to nodes in each region. A number of experiments were performed to evaluate the performance of the MR-RHVH framework, using YCSB for database benchmarking. The results show higher scalability and lower time consumption for the MR-RHVH framework compared with several recent systems.
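For readers unfamiliar with the hashing scheme named in the abstract, the following is a minimal sketch of plain rendezvous (highest-random-weight) hashing. It is not the MR-RHVH framework itself: the virtual hierarchies, MapReduce stages, and Bloom filter evaluator are not shown, and the node names, key names, and the choice of SHA-256 as the scoring hash are assumptions made purely for illustration.

```python
import hashlib
from typing import Sequence


def score(key: str, node: str) -> int:
    """Pseudo-random weight for a (node, key) pair (SHA-256 is an assumed choice)."""
    digest = hashlib.sha256(f"{node}|{key}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def owner(key: str, nodes: Sequence[str]) -> str:
    """Return the node with the highest weight for this key."""
    if not nodes:
        raise ValueError("at least one node is required")
    return max(nodes, key=lambda node: score(key, node))


# Hypothetical cluster and keys, for illustration only.
nodes = ["node-a", "node-b", "node-c", "node-d"]
keys = [f"user{i}" for i in range(1000)]

before = {k: owner(k, nodes) for k in keys}
remaining = [n for n in nodes if n != "node-c"]
after = {k: owner(k, remaining) for k in keys}

# Only keys that were stored on the removed node change owner.
moved = [k for k in keys if before[k] != after[k]]
assert all(before[k] == "node-c" for k in moved)
```

The final assertion illustrates the property that makes this family of schemes attractive for partitioning: removing a node relocates only the keys that node owned, while every other key keeps its placement.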
Pages: 4444-4465
Number of pages: 22