Prefetching-based metadata management in Advanced Multitenant Hadoop

Cited by: 8
Authors
Minh Chau Nguyen [1]
Won, Heesun [1 ]
Son, Siwoon [2 ]
Gil, Myeong-Seon [2 ]
Moon, Yang-Sae [2 ]
Affiliations
[1] ETRI, BigData Intelligence Res Dept, Daejeon, South Korea
[2] Kangwon Natl Univ, Dept Comp Sci, Chunchon, South Korea
Keywords
Big data; Hadoop; Metadata management; Multitenancy; Prefetching; SERVICE; FRAMEWORK
DOI
10.1007/s11227-017-2019-5
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Metadata management is an essential part of Apache Hadoop. Optimizing metadata accesses improves the storage, processing, and analysis of big data, especially in multitenant environments. However, as environmental complexity increases, metadata management becomes more challenging and costly because of severe performance issues. In this paper, we propose a novel prefetching-based approach to improve the performance of metadata management for Hadoop in a multitenant environment. We create metadata access graphs from historical access values, define access patterns, and then prefetch the items most likely to be needed by near-future requests in order to minimize latency. We present a formal algorithm for applying the prefetching mechanism to the Hadoop system and implement it on a recent Hadoop system. Experimental results show that the proposed approach achieves high performance for metadata management while maintaining advanced multitenancy features.
Pages: 533-553
Number of pages: 21
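The prefetching idea summarized in the abstract (build an access graph from historical requests, then load likely successors ahead of time) can be illustrated with a minimal sketch. This is not the paper's formal algorithm and it uses no Hadoop API. The class names `AccessGraph` and `PrefetchingMetadataCache`, the `backend` callable, and the `top_k` and `min_prob` thresholds are all hypothetical, chosen only for illustration. The sketch assumes a first-order pattern, where only the immediately preceding request predicts the next one.

```python
from collections import defaultdict


class AccessGraph:
    """Directed graph whose edge weights count how often one metadata key
    was requested immediately after another (assumed first-order pattern)."""

    def __init__(self):
        self.edges = defaultdict(lambda: defaultdict(int))
        self.last_key = {}  # last key requested by each tenant

    def record(self, tenant, key):
        """Add one observed access to the history of the given tenant."""
        prev = self.last_key.get(tenant)
        if prev is not None and prev != key:
            self.edges[prev][key] += 1
        self.last_key[tenant] = key

    def successors(self, key, top_k=2, min_prob=0.3):
        """Return up to top_k likely next keys whose share of all
        observed transitions out of `key` is at least min_prob."""
        nxt = self.edges.get(key)
        if not nxt:
            return []
        total = sum(nxt.values())
        ranked = sorted(nxt.items(), key=lambda kv: (-kv[1], kv[0]))
        return [k for k, count in ranked[:top_k] if count / total >= min_prob]


class PrefetchingMetadataCache:
    """Cache in front of a slow metadata store (hypothetical `backend`
    callable: key -> value) that prefetches predicted successors."""

    def __init__(self, backend, graph):
        self.backend = backend
        self.graph = graph
        self.cache = {}
        self.hits = 0
        self.misses = 0

    def get(self, tenant, key):
        if key in self.cache:
            self.hits += 1
            value = self.cache[key]
        else:
            self.misses += 1
            value = self.backend(key)
            self.cache[key] = value
        # Update the history, then prefetch what is likely to come next.
        self.graph.record(tenant, key)
        for nxt in self.graph.successors(key):
            if nxt not in self.cache:
                self.cache[nxt] = self.backend(nxt)
        return value


# Usage: replay a historical trace, then serve new requests.
graph = AccessGraph()
for key in ["db", "table", "columns"] * 3:
    graph.record("tenant-a", key)

cache = PrefetchingMetadataCache(backend=lambda k: f"meta({k})", graph=graph)
for key in ["db", "table", "columns"]:
    cache.get("tenant-a", key)
print(cache.hits, cache.misses)  # 2 1: only the first request misses
```

In this toy trace only the first request goes to the slow store on demand; the other two are served from entries that were prefetched. A real multitenant deployment would also need cache eviction and per-tenant isolation of cached entries, which this sketch leaves out.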