HaoLap: A Hadoop based OLAP system for big data

Cited by: 30
Authors
Song, Jie [1 ]
Guo, Chaopeng [1 ]
Wang, Zhi [1 ]
Zhang, Yichan [1 ]
Yu, Ge [2 ]
Pierson, Jean-Marc [3 ]
Affiliations
[1] Northeastern Univ, Software Coll, Shenyang 110819, Peoples R China
[2] Northeastern Univ, Sch Informat & Engn, Shenyang 110819, Peoples R China
[3] Univ Toulouse 3, Lab IRIT, F-31062 Toulouse, France
Funding
National Research Foundation, Singapore; National Natural Science Foundation of China;
Keywords
Cloud data warehouse; Multidimensional data model; MapReduce;
DOI
10.1016/j.jss.2014.09.024
Chinese Library Classification (CLC) number
TP31 [Computer software];
Discipline classification code
081202 ; 0835 ;
Abstract
In recent years, facing the information explosion, industry and academia have adopted distributed file systems and the MapReduce programming model to address the new challenges that big data has brought. Based on these technologies, this paper presents HaoLap (Hadoop based oLap), an OLAP (OnLine Analytical Processing) system for big data. Drawing on the experience of Multidimensional OLAP (MOLAP), HaoLap adopts a specified multidimensional model to map the dimensions and the measures; a dimension coding and traversal algorithm to achieve the roll-up operation on the dimension hierarchy; a partition and linearization algorithm to store dimensions and measures; a chunk selection algorithm to optimize OLAP performance; and MapReduce to execute OLAP. The paper illustrates the key techniques of HaoLap, including system architecture, dimension definition, dimension coding and traversal, partitioning, data storage, OLAP, and the data loading algorithm. We evaluated HaoLap on a real application and compared it with Hive, HadoopDB, HBaseLattice, and Olap4Cloud. The experimental results show that HaoLap boosts the efficiency of data loading, has a great advantage in OLAP performance with respect to data set size and query complexity, and fully supports dimension operations. (C) 2014 Elsevier Inc. All rights reserved.
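The abstract names dimension coding, linearization, and roll-up but does not spell out the algorithms. The sketch below is only a generic MOLAP-style illustration of those ideas, not HaoLap's actual method. It assumes each dimension member has an integer code in a fixed range, that cells are stored under a single row-major index, and that each parent in the hierarchy groups a fixed number of consecutive child codes. The function names `linearize`, `delinearize`, and `roll_up` are hypothetical.

```python
def linearize(coords, sizes):
    """Map multidimensional integer coordinates to one row-major index."""
    if len(coords) != len(sizes):
        raise ValueError("coords and sizes must have the same length")
    index = 0
    for c, n in zip(coords, sizes):
        if not 0 <= c < n:
            raise ValueError(f"coordinate {c} out of range for size {n}")
        index = index * n + c
    return index


def delinearize(index, sizes):
    """Invert linearize: recover the coordinate tuple from the index."""
    coords = []
    for n in reversed(sizes):
        index, c = divmod(index, n)
        coords.append(c)
    return tuple(reversed(coords))


def roll_up(cells, sizes, dim, group_size):
    """Sum measures after coarsening one dimension.

    Assumption: a parent member groups `group_size` consecutive child
    codes, so the parent code is child_code // group_size.
    """
    coarse_sizes = list(sizes)
    coarse_sizes[dim] = -(-sizes[dim] // group_size)  # ceiling division
    result = {}
    for index, measure in cells.items():
        coords = list(delinearize(index, sizes))
        coords[dim] //= group_size
        key = linearize(coords, coarse_sizes)
        result[key] = result.get(key, 0) + measure
    return result, tuple(coarse_sizes)


# Hypothetical cube: 4 days x 3 products, sparse cells keyed by linear index.
sizes = (4, 3)
cells = {
    linearize((0, 0), sizes): 1.0,
    linearize((1, 0), sizes): 2.0,
    linearize((2, 1), sizes): 5.0,
}
# Roll days up into 2-day periods along dimension 0.
rolled, rolled_sizes = roll_up(cells, sizes, dim=0, group_size=2)
```

Storing only the linear index and the measure keeps sparse cells compact, and a roll-up reduces to integer arithmetic on the codes. Under these assumptions, no dimension table lookup is needed during aggregation.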
Pages: 167-181
Page count: 15
Related papers
50 in total
  • [31] Minimizing Big Data Problems using Cloud Computing Based on Hadoop Architecture
    Adnan, Muhammad
    Afzal, Muhammad
    Aslam, Muhammad
    Jan, Roohl
    Martinez-Enriquez, A. M.
    2014 11TH ANNUAL HIGH CAPACITY OPTICAL NETWORKS AND EMERGING/ENABLING TECHNOLOGIES (PHOTONICS FOR ENERGY), 2014, : 99 - 103
  • [32] Big Data: Mining of Log File through Hadoop
    Kotiyal, Bina
    Kumar, Ankit
    Pant, Bhaskar
    Goudar, R. H.
    2013 INTERNATIONAL CONFERENCE ON HUMAN COMPUTER INTERACTIONS (ICHCI), 2013,
  • [33] The Role of Hadoop Technology in the Implementation of Big Data Concept
    Stupar, Savo
    Car, Mirha Bico
    Sahic, Elvir
    NEW TECHNOLOGIES, DEVELOPMENT AND APPLICATION, 2019, 42 : 254 - 261
  • [34] A Systematic Literature Review of Big Data and the Hadoop frameworks
    Naidu, Devishree
    Thakur, Adi
    INTERNATIONAL JOURNAL OF EARLY CHILDHOOD SPECIAL EDUCATION, 2022, 14 (02) : 2969 - 2973
  • [35] A Big Data Hadoop Architecture for Online Analysis.
    Lakavath, Suresh
    Naik, Ramlal L.
    INTERNATIONAL JOURNAL OF COMPUTER SCIENCE AND NETWORK SECURITY, 2015, 15 (11) : 58 - 62
  • [36] Monitoring big process data of industrial plants with multiple operating modes based on Hadoop
    Zhu, Jinlin
    Yao, Yuan
    Li, Dewei
    Gao, Furong
    JOURNAL OF THE TAIWAN INSTITUTE OF CHEMICAL ENGINEERS, 2018, 91 : 10 - 21
  • [37] Big Data Processing Using Hadoop and Spark: The Case of Meteorology Data
    Hussein, Eslam
    Sadiki, Ronewa
    Jafta, Yahlieel
    Sungay, Muhammad Mujahid
    Ajayi, Olasupo
    Bagula, Antoine
    E-INFRASTRUCTURE AND E-SERVICES FOR DEVELOPING COUNTRIES (AFRICOMM 2019), 2020, 311 : 180 - 185
  • [38] Design of Electric Power Data Management System Based on Hadoop
    Li, Yongheng
    Wang, Yongzhi
    Jin, Liang
    PROCEEDINGS OF THE 2016 4TH INTERNATIONAL CONFERENCE ON MACHINERY, MATERIALS AND INFORMATION TECHNOLOGY APPLICATIONS, 2016, 71 : 1090 - 1093
  • [39] Application of HADOOP to Store and Process Big Data Gathered from an Urban Water Distribution System
    Jach, Tomasz
    Magiera, Ewa
    Froelich, Wojciech
    COMPUTING AND CONTROL FOR THE WATER INDUSTRY (CCWI2015): SHARING THE BEST PRACTICE IN WATER MANAGEMENT, 2015, 119 : 1375 - 1380
  • [40] Software-Defined Networking for Scalable Cloud-based Services to Improve System Performance of Hadoop-based Big Data Applications
    Hagos, Desta Haileselassie
    INTERNATIONAL JOURNAL OF GRID AND HIGH PERFORMANCE COMPUTING, 2016, 8 (02) : 1 - 22