The Performance Analysis of Distributed Storage Systems Used in Scalable Web Systems

Authors
Oles, Dominik [1]
Nowak, Ziemowit [2]
Affiliations
[1] Tieto Czech Sro, 28 Rijna 3346-91, Ostrava 70200, Czech Republic
[2] Wroclaw Univ Sci & Technol, Fac Comp Sci & Management, Wybrzeze Wyspianskiego 27, PL-50370 Wroclaw, Poland
Source
INFORMATION SYSTEMS ARCHITECTURE AND TECHNOLOGY, ISAT 2018, PT I | 2019 / Vol. 852
Keywords
Big Data; Hadoop; HBase; Kudu
DOI
10.1007/978-3-319-99981-4_27
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Scalable web systems are directly related to the distributed storage systems used to process large amounts of data (big data). An example of such a system is Hadoop, with its many extensions supporting data storage, such as SQL-on-Hadoop systems and the "Parquet" file format. Another kind of system for storing and processing big data is the NoSQL database, such as HBase, which is used in applications requiring fast random access. The Kudu system was created to combine the advantages of Hadoop and HBase and to enable both effective analysis of data sets and fast random access. This research analyzed the performance of the mentioned systems. The experiment was conducted in the Amazon Web Services public cloud environment, where a cluster of nine virtual machines was configured. For the purposes of the research, a fragment of the "Wikipedia Page Traffic Statistics" public dataset containing about a billion rows was used. The results of the measurements confirm that the Kudu system is a promising alternative to the commonly used technologies.
Pages: 287-298
Number of pages: 12