Experimental Characteristics Study of Data Storage Formats for Data Marts Development within Data Lakes

Cited by: 3
Authors
Belov, Vladimir [1]
Kosenkov, Alexander N. [2]
Nikulchev, Evgeny [1]
Affiliations
[1] MIREA Russian Technol Univ, Dept Intelligent Informat Secur Syst, Moscow 119454, Russia
[2] Sechenov First Moscow State Med Univ, Dept Hosp Surg, Moscow 119435, Russia
Source
APPLIED SCIENCES-BASEL | 2021 / Vol. 11 / Issue 18
Keywords
big data; data lakes; data storage formats; data marts
DOI
10.3390/app11188651
Chinese Library Classification number
O6 [Chemistry];
Discipline classification code
0703;
Abstract
One of the most popular approaches to building analytical platforms is based on the concept of a data lake. A data lake is a storage system in which data are kept in their original format, which makes it difficult to run analytics or present aggregated data. To address this, data marts are used: stores of highly specialized information oriented toward the requests of employees of a particular department or a particular line of the organization's work. This article presents a study of big data storage formats on the Apache Hadoop platform when they are used to build data marts.
Pages: 9