Efficient Metric All-k-Nearest-Neighbor Search on Datasets Without Any Index

被引:0
作者
Hai-Da Zhang
Zhi-Hao Xing
Lu Chen
Yun-Jun Gao
机构
[1] Zhejiang University,College of Computer Science
[2] The University of New South Wales,School of Computer Science and Engineering
来源
Journal of Computer Science and Technology | 2016年 / 31卷
关键词
all-k-nearest-neighbor search; query processing; metric space;
D O I
暂无
中图分类号
学科分类号
摘要
An all-k-nearest-neighbor (AkNN) query finds k nearest neighbors for each query object. This problem arises naturally in many areas, such as GIS (geographic information system), multimedia retrieval, and recommender systems. To support various data types and flexible distance metrics involved in real applications, we study AkNN retrieval in metric spaces, namely, metric AkNN (MAkNN) search. Consider that the underlying indexes on the query set and the object set may not exist, which is natural in many scenarios. For example, the query set and the object set could be the results of other queries, and thus, the underlying indexes cannot be built in advance. To support MAkNN search on datasets without any underlying index, we propose an efficient disk-based algorithm, termed as Partition-Based MAkNN Algorithm (PMA), which follows a partition-search framework and employs a series of pruning rules for accelerating the search. In addition, we extend our techniques to tackle an interesting variant of MAkNN queries, i.e., metric self-AkNN (MSAkNN) search, where the query set is identical to the object set. Extensive experiments using both real and synthetic datasets demonstrate the effectiveness of our pruning rules and the efficiency of the proposed algorithms, compared with state-of-the-art MAkNN and MSAkNN algorithms.
引用
收藏
页码:1194 / 1211
页数:17
相关论文
共 51 条
[1]  
Chen HL(2011)All-nearest-neighbors finding based on the Hilbert curve Expert Systems with Applications 38 7462-7475
[2]  
Chang YI(2004)The k-nearest neighbour join: Turbo charging the KDD process Knowledge and Information Systems 6 728-749
[3]  
Böhm C(2012)Approximate all nearest neighbor search for high dimensional entropy estimation for image registration Signal Processing 92 1302-1316
[4]  
Krebs F(2012)Efficient processing of k nearest neighbor joins using MapReduce Proceedings of the PVLDB Endowment 5 1016-1027
[5]  
Kybic J(2007)A fast all nearest neighbor algorithm for applications involving large point-clouds Computers & Graphics 31 157-174
[6]  
Vnučko I(1997)An optimal algorithm for the anglerestricted all nearest neighbor problem on the reconfigurable mesh, with applications IEEE Transactions on Parallel and Distributed Systems 8 983-990
[7]  
Lu W(2005)Efficient algorithms for the all nearest neighbor and closest pair problems on the linear array with a reconfigurable pipelined bus system IEEE Transactions on Parallel and Distributed Systems 16 193-206
[8]  
Shen YY(2007)Efficient index-based KNN join processing for high-dimensional data Information and Software Technology 49 332-344
[9]  
Chen S(2010)High-dimensional KNN joins with incremental updates Geoinformatica 14 55-82
[10]  
Ooi BC(1999)Nearest neighbor queries in metric spaces Discrete & Computational Geometry 22 63-93