HaoLap: A Hadoop based OLAP system for big data

Cited by: 30
Authors
Song, Jie [1]
Guo, Chaopeng [1]
Wang, Zhi [1]
Zhang, Yichan [1]
Yu, Ge [2]
Pierson, Jean-Marc [3]
Affiliations
[1] Northeastern Univ, Software Coll, Shenyang 110819, Peoples R China
[2] Northeastern Univ, Sch Informat & Engn, Shenyang 110819, Peoples R China
[3] Univ Toulouse 3, Lab IRIT, F-31062 Toulouse, France
Funding
National Research Foundation, Singapore; National Natural Science Foundation of China
Keywords
Cloud data warehouse; Multidimensional data model; MapReduce
DOI
10.1016/j.jss.2014.09.024
Chinese Library Classification number
TP31 [Computer software]
Discipline classification code
081202; 0835
Abstract
In recent years, facing the information explosion, industry and academia have adopted distributed file systems and the MapReduce programming model to address the new challenges that big data has brought. Based on these technologies, this paper presents HaoLap (Hadoop based oLap), an OLAP (OnLine Analytical Processing) system for big data. Drawing on the experience of Multidimensional OLAP (MOLAP), HaoLap adopts a specified multidimensional model to map dimensions and measures; a dimension coding and traversal algorithm to achieve the roll-up operation on dimension hierarchies; a partition and linearization algorithm to store dimensions and measures; a chunk selection algorithm to optimize OLAP performance; and MapReduce to execute OLAP. The paper illustrates the key techniques of HaoLap, including system architecture, dimension definition, dimension coding and traversal, partition, data storage, OLAP, and the data loading algorithm. We evaluated HaoLap on a real application and compared it with Hive, HadoopDB, HBaseLattice, and Olap4Cloud. The experimental results show that HaoLap boosts the efficiency of data loading, has a great advantage in OLAP performance with respect to data set size and query complexity, and fully supports dimension operations. (C) 2014 Elsevier Inc. All rights reserved.
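The abstract names a partition and linearization algorithm without giving its details. As a rough illustration only, the following Python sketch shows the generic row-major linearization commonly used for MOLAP-style array storage, together with a simple fixed-shape chunk partition. The function names (`linearize`, `delinearize`, `chunk_of`) and the fixed chunk shape are assumptions made for this sketch; they are not taken from the paper and may differ from what HaoLap actually does.

```python
from typing import Sequence, Tuple


def linearize(coords: Sequence[int], sizes: Sequence[int]) -> int:
    """Map a cell's per-dimension coordinates to one row-major index.

    `sizes` holds the number of members in each dimension (hypothetical input).
    """
    if len(coords) != len(sizes):
        raise ValueError("coords and sizes must have the same length")
    index = 0
    for coord, size in zip(coords, sizes):
        if not 0 <= coord < size:
            raise ValueError(f"coordinate {coord} out of range for size {size}")
        # Horner-style accumulation: shift by this dimension's size, then add.
        index = index * size + coord
    return index


def delinearize(index: int, sizes: Sequence[int]) -> Tuple[int, ...]:
    """Inverse of linearize: recover the coordinates from a row-major index."""
    coords = []
    for size in reversed(sizes):
        index, coord = divmod(index, size)
        coords.append(coord)
    return tuple(reversed(coords))


def chunk_of(coords: Sequence[int], chunk_shape: Sequence[int]) -> Tuple[int, ...]:
    """Identify the chunk holding a cell, assuming fixed-shape chunks."""
    return tuple(coord // extent for coord, extent in zip(coords, chunk_shape))


if __name__ == "__main__":
    sizes = (4, 5, 6)    # e.g. three dimensions with 4, 5, and 6 members
    cell = (1, 2, 3)
    idx = linearize(cell, sizes)
    print(idx, delinearize(idx, sizes), chunk_of(cell, (2, 2, 2)))
```

Under these assumptions, a query that touches only some coordinate ranges needs to read only the chunks whose identifiers overlap those ranges, which is the general idea behind selecting chunks before scanning.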
Pages: 167-181
Number of pages: 15