A Multi-dimensional Index Structure Based on Improved VA-file and CAN in the Cloud

Cited by: 13
Authors
Cheng, Chun-Ling [1 ,2 ,3 ]
Sun, Chun-Ju [1 ]
Xu, Xiao-Long [1 ,2 ]
Zhang, Deng-Yin [3 ]
Affiliations
[1] Nanjing Univ Posts & Telecommun, Coll Comp, Nanjing 210003, Jiangsu, Peoples R China
[2] Jiangsu High Technol Res Key Lab Wireless Sensor, Nanjing 210003, Jiangsu, Peoples R China
[3] Nanjing Univ Posts & Telecommun, Minist Educ Jiangsu Prov, Key Lab Broadband Wireless Commun & Sensor Networ, Nanjing 210003, Jiangsu, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Cloud computing; index; similarity search; clustering; vector approximation file (VA-file); content addressable network (CAN);
DOI
10.1007/s11633-014-0772-y
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Current cloud computing systems rely on simple key-value data processing, which cannot support similarity search effectively because efficient index structures are lacking. As dimensionality increases, existing tree-like index structures also suffer from the "curse of dimensionality". This paper proposes a novel indexing scheme, VF-CAN, which integrates a content addressable network (CAN) based routing protocol with an improved vector approximation file (VA-file) index. The scheme has two index levels: a global index and a local index. The local index, VAK-file, is built for the data in each storage node; it is the result of k-means clustering of the VA-file approximation vectors according to their degree of proximity. Each cluster forms a separate local index file that stores the approximation vectors contained in the cluster, and the vector of each cluster center is stored in the cluster center information file of the corresponding storage node. In the global index, storage nodes are organized into a CAN overlay network and, to reduce the cost of calculation, only the clustering information of the local index is published to the entire overlay network through the CAN interface. Experimental results show that VF-CAN reduces index storage space and improves query performance effectively.
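The local-index idea in the abstract (quantize vectors VA-file style, then group the approximation vectors with k-means and expose only the cluster centers) can be illustrated with a minimal Python sketch. This is not the paper's implementation: the function names, the equal-width quantization grid, the `bits` parameter, the fixed random seed, and the dictionary layout are all assumptions made for illustration, and CAN routing is not modeled.

```python
import random

def va_approximate(vectors, bits=2, lo=0.0, hi=1.0):
    """Map each coordinate to one of 2**bits equal-width cells (assumed grid)."""
    cells = 1 << bits
    width = (hi - lo) / cells
    return [
        tuple(min(cells - 1, max(0, int((x - lo) / width))) for x in v)
        for v in vectors
    ]

def nearest(p, centers):
    """Index of the center with the smallest squared Euclidean distance to p."""
    return min(
        range(len(centers)),
        key=lambda j: sum((a - b) ** 2 for a, b in zip(p, centers[j])),
    )

def assign(points, centers):
    """Group points by their nearest center."""
    clusters = [[] for _ in centers]
    for p in points:
        clusters[nearest(p, centers)].append(p)
    return clusters

def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd-style k-means over approximation vectors."""
    rng = random.Random(seed)
    centers = [tuple(float(x) for x in p) for p in rng.sample(points, k)]
    for _ in range(iters):
        clusters = assign(points, centers)
        updated = [
            tuple(sum(col) / len(members) for col in zip(*members))
            if members else centers[j]  # keep the old center for an empty cluster
            for j, members in enumerate(clusters)
        ]
        if updated == centers:
            break
        centers = updated
    return centers, assign(points, centers)

def build_vak_index(vectors, k, bits=2):
    """Hypothetical local index: one 'file' per cluster plus the cluster centers.

    Only 'published_centers' would be announced to the overlay network.
    """
    approx = va_approximate(vectors, bits)
    centers, clusters = kmeans(approx, k)
    return {"published_centers": centers, "local_files": clusters}

index = build_vak_index(
    [(0.05, 0.10), (0.10, 0.05), (0.90, 0.95), (0.95, 0.90)], k=2
)
```

In this sketch a query would first be compared against `published_centers` to pick candidate clusters, and only the matching `local_files` entries would then be scanned; that lookup order is an assumption about how such a two-level index is typically used, not a detail stated in the abstract.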
Pages: 109-117
Page count: 9