Big high-dimension data cube designs for hybrid memory systems

Cited by: 2
Authors
Silva, Rodrigo Rocha [1]
Hirata, Celso Massaki [2]
Lima, Joubert de Castro [3]
Affiliations
[1] Univ Coimbra, Fac Tecnol Sao Paulo, Rua Carlos Barattino,908 Vila Nova Mogilar, BR-08773600 Mogi Das Cruzes, SP, Brazil
[2] Inst Tecnol Aeronaut, Sao Jose Dos Campos, SP, Brazil
[3] Univ Fed Ouro Preto, Ouro Preto, SP, Brazil
Funding
São Paulo Research Foundation, Brazil
Keywords
Multidimensional database; Multidimensional query; Big Data; Data cube; Holistic measure; High dimension; COMPUTATION;
DOI
10.1007/s10115-020-01505-9
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In Big Data cubes with hundreds of dimensions and billions of tuples, indexing and querying are challenging because computing a full cube has exponential time and space complexity. Solutions based solely on RAM may therefore be impractical, and solutions based on hybrid memory (RAM and disk) become viable alternatives. In this paper, we propose a hybrid approach, named bCubing, to index and query high-dimension data cubes with a high number of tuples on a single machine using both RAM and disk. We evaluated bCubing in terms of runtime and memory consumption, comparing it with the Frag-Cubing, HIC and H-Frag approaches. bCubing proved faster and used less RAM than all three. It indexed, and supported queries on, a data cube with 1.2 billion tuples and 60 dimensions while consuming only 84 GB of RAM, 35% less memory than HIC. The complex holistic measures mode and median were computed in multidimensional queries, and bCubing was, on average, 50% faster than HIC.
Pages: 4717-4746
Page count: 30