EventDB: A Large-Scale Semi-structured Scientific Data Management System

Authors
Zhao, Wenjia [1]
Qi, Yong [1]
Hou, Di [1]
Wang, Peijian [1]
Gao, Xin [1]
Du, Zirong [1]
Zhang, Yudong [1]
Zong, Yongfang [1]
Affiliations
[1] Xi'an Jiaotong University, Department of Computer Science and Technology, Xi'an, People's Republic of China
Source
BIG SCIENTIFIC DATA MANAGEMENT | 2019 / Vol. 11473
Funding
National Key Research and Development Program;
Keywords
Scientific big data; Data storage; Data retrieval; HBase;
DOI
10.1007/978-3-030-28061-1_12
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Scientific experimental devices now collect hundreds of PB of data per year, and using these data efficiently to produce scientific findings is an active problem. Scientific big data poses many challenges, including the storage, processing, and sharing of the data. In this paper, we propose EventDB, a data management system for scientific big data. EventDB provides data management functions for massive semi-structured scientific data. Within EventDB, we propose IndexDB for faster data retrieval, cross-domain access for better data sharing, and operator libraries for higher-performance data analysis. Our preliminary experiments show that our system improves data retrieval performance by more than 6 times.
Pages: 105-115
Page count: 11