Efficient Data Management Tools for the Heterogeneous Big Data Warehouse

Cited by: 3
Authors
Alekseev, A. A. [1]
Osipova, V. V. [1]
Ivanov, M. A. [1]
Klimentov, A. [2]
Grigorieva, N. V. [1]
Nalamwar, H. S. [1]
Affiliations
[1] Natl Res Tomsk Polytech Univ, Tomsk, Russia
[2] Brookhaven Natl Lab, Upton, NY 11973 USA
Keywords
Relational Database Management System (RDBMS); Non-relational Structure Query Language (NoSQL); Structure Query Language (SQL); Big Data; Heterogeneous Data Warehouse; Apache Hadoop; Hive; MongoDB; Data Manipulation Language (DML) Operations
DOI
10.1134/S1547477116050022
Chinese Library Classification
O412 [Relativity, Field Theory]; O572.2 [Particle Physics]
Abstract
The traditional RDBMS is well suited to normalized data structures. RDBMSs have served well for decades, but the technology is not optimal for data processing and analysis in data-intensive fields such as social networks, the oil and gas industry, and experiments at the Large Hadron Collider. Several challenges have recently been raised regarding the scalability of data-warehouse-like workloads against a transactional schema, in particular for the analysis of archived data or the aggregation of data for summary and accounting purposes. The paper evaluates newer database technologies such as HBase, Cassandra, and MongoDB, commonly referred to as NoSQL databases, for handling messy, varied, and large amounts of data. The evaluation is based on the performance, throughput, and scalability of these technologies for several scientific and industrial use cases. The paper outlines the technologies and architectures needed for processing Big Data, and describes a back-end application that implements data migration from an RDBMS to a NoSQL data warehouse, the organization of the NoSQL database, and how it could be useful for further data analytics.
Pages: 689-692
Number of pages: 4