A Cloud Computing Implementation of XML Indexing Method Using Hadoop

Citations: 0
Authors
Hsu, Wen-Chiao [1]
Liao, I-En [1]
Shih, Hsiao-Chen [1]
Affiliations
[1] Natl Chung Hsing Univ, Dept Comp Sci & Engn, Taichung 402, Taiwan
Source
INTELLIGENT INFORMATION AND DATABASE SYSTEMS (ACIIDS 2012), PT III | 2012 / Vol. 7198
Keywords
Hadoop; Cloud Computing; XML Indexing; XML query; Node Clustering Indexing Method
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
With data growing at an incredible rate, the development of cloud computing technologies is critically important to the advancement of research. Apache Hadoop has become a widely used open source cloud computing framework that provides a distributed file system for large-scale data processing. In this paper, we present a cloud computing implementation of an XML indexing method called NCIM (Node Clustering Indexing Method), developed by our research team, for indexing and querying a large number of big XML documents using MapReduce. The experimental results show that NCIM is suitable for a cloud computing environment. A throughput of 1200 queries per second for a huge number of queries on a 15-node cluster indicates the potential of NCIM for fast query processing of enormous numbers of Internet documents.
Pages: 256-265
Number of pages: 10