Impacts of data consistency levels in cloud-based NoSQL for data-intensive applications

Authors
Ferreira, Saulo [1]
Mendonca, Julio [2]
Nogueira, Bruno [3]
Tiengo, Willy [3]
Andrade, Ermeson [1]
Affiliations
[1] Univ Fed Rural Pernambuco, Recife, PE, Brazil
[2] Univ Luxembourg, Interdisciplinary Ctr Secur Reliabil & Trust SnT, Luxembourg, Luxembourg
[3] Univ Fed Alagoas, Maceio, Alagoas, Brazil
Keywords
Cloud; Data consistency; Databases; NoSQL; Performance
DOI
10.1186/s13677-024-00716-7
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
When using database management systems (DBMSs), it is common to distribute instance replicas across multiple locations for disaster recovery and scaling purposes. To geo-replicate data efficiently, it is crucial to ensure that the data and its replicas remain consistent and reflect the most up-to-date state. However, DBMSs' internal characteristics and external factors, such as the replication strategy and network latency, can affect system performance when dealing with data replication, especially when the replicas are deployed far apart from one another. Thus, it is essential to understand how achieving high data consistency levels in geo-replicated systems can impact system performance. This work analyzes various data consistency settings for three widely used NoSQL DBMSs, namely MongoDB, Redis, and Cassandra. The analysis is based on real-world experiments in which DBMS nodes are deployed on cloud platforms in different locations, considering single-region and multi-region deployments. Based on the results of the experiments, we provide a comprehensive analysis of system throughput and response time when executing read and write operations, pointing out scenarios where each DBMS could be better employed. Our findings include, for instance, that opting for strong data consistency significantly impacts Cassandra's read operations in the single-region deployment, while MongoDB's write operations are most affected in the multi-region scenario. Additionally, all of these DBMSs exhibit statistically significant variations across all scenarios in the multi-region setup when the data consistency is switched from a weak to a stronger level.
Pages: 16