Distributed Index Mechanism based on Hadoop

Cited by: 0
Authors
Liu, Qin [1 ]
Zhang, Ni [1 ]
Yang, Xiaowen [1 ]
Zhu, Hongming [1 ]
Affiliations
[1] Tongji Univ, Sch Software Engn, Shanghai, Peoples R China
Source
2014 14TH INTERNATIONAL SYMPOSIUM ON COMMUNICATIONS AND INFORMATION TECHNOLOGIES (ISCIT) | 2014
Keywords
MapReduce; Hadoop; schema; index
DOI
Not available
Chinese Library Classification number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
In recent years, MapReduce has attracted much attention. However, MapReduce has a weakness: it requires an entire block scan because it cannot precisely locate the query result. Some existing work has built indexes on Hadoop, but some of these approaches handle only full-text search and cannot support datasets with a defined schema. There is not yet a general distributed index system for unstructured data, optimized from MapReduce, that can handle multi-schema datasets and support queries well both with and without an index. In this paper, we propose a distributed index mechanism and build it on MapReduce, where it can reduce query time and the number of map tasks in some contexts. Moreover, this distributed index mechanism supports multi-schema datasets, scales well, and is customizable. In our experiments, the distributed index mechanism saves up to 30% of query time and 90% of map tasks in some contexts compared with the original MapReduce framework, and the advantage grows as the dataset expands.
Pages: 274-278
Page count: 5