Prefetching-based metadata management in Advanced Multitenant Hadoop

Cited by: 0
Authors
Minh Chau Nguyen
Heesun Won
Siwoon Son
Myeong-Seon Gil
Yang-Sae Moon
Affiliations
[1] ETRI, BigData Intelligence Research Department
[2] Kangwon National University, Department of Computer Science
Source
The Journal of Supercomputing | 2019 / Vol. 75
Keywords
Big data; Hadoop; Metadata management; Multitenancy; Prefetching
DOI
Not available
Abstract
Metadata management is an essential part of Apache Hadoop. Optimizing metadata access improves big data storage, processing, and analysis, especially in multitenant environments. However, as environments grow more complex, metadata management becomes more challenging and costly because of severe performance issues. In this paper, we propose a novel prefetching-based approach to improve the performance of metadata management for Hadoop in multitenant environments. We create metadata access graphs from historical access values, define access patterns, and then prefetch items likely to be needed by near-future requests to minimize latency. We present a formal algorithm for applying the prefetching mechanism to Hadoop and implement it on a recent Hadoop system. Experimental results show that the proposed approach delivers high performance for metadata management while maintaining advanced multitenancy features.
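The abstract outlines the pipeline only at a high level: build an access graph from history, derive patterns, and prefetch likely next items. The following is a minimal sketch of that general idea, not the paper's algorithm. It assumes a first-order graph in which an edge counts how often one metadata key was requested immediately after another; the names `AccessGraph`, `PrefetchingCache`, `top_k`, and `min_prob`, as well as the example keys, are hypothetical and do not come from the paper or from any Hadoop API.

```python
from collections import Counter, defaultdict


class AccessGraph:
    """Directed graph: edge a -> b counts how often b was requested right after a."""

    def __init__(self):
        self.edges = defaultdict(Counter)

    def record(self, history):
        # Count every consecutive pair in one access history (assumed per tenant).
        for prev, nxt in zip(history, history[1:]):
            self.edges[prev][nxt] += 1

    def candidates(self, key, top_k=2, min_prob=0.3):
        # Return up to top_k successors of `key` whose observed transition
        # frequency is at least min_prob (both thresholds are illustrative).
        followers = self.edges.get(key)
        if not followers:
            return []
        total = sum(followers.values())
        return [k for k, n in followers.most_common(top_k) if n / total >= min_prob]


class PrefetchingCache:
    """Cache that loads likely successors of each requested key ahead of time."""

    def __init__(self, graph, fetch):
        self.graph = graph
        self.fetch = fetch  # callable that loads one item from the backing store
        self.cache = {}
        self.hits = 0
        self.misses = 0

    def get(self, key):
        if key in self.cache:
            self.hits += 1
            value = self.cache[key]
        else:
            self.misses += 1
            value = self.cache[key] = self.fetch(key)
        # Prefetch the items that history suggests will be requested next.
        for nxt in self.graph.candidates(key):
            if nxt not in self.cache:
                self.cache[nxt] = self.fetch(nxt)
        return value


# Hypothetical access histories; the keys stand in for metadata objects.
graph = AccessGraph()
graph.record(["db", "table", "partition", "db", "table", "columns"])
graph.record(["db", "table", "partition"])

store = PrefetchingCache(graph, fetch=lambda k: f"metadata:{k}")
for key in ["db", "table", "partition"]:
    store.get(key)
print(store.hits, store.misses)
```

In this sketch only the first request goes to the backing store on demand; the later ones are served from entries prefetched on the previous step. A real deployment would also need cache eviction, per-tenant isolation, and a bound on prefetch cost, none of which are modeled here.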
Pages: 533-553
Page count: 20