Quicker range- and k-NN joins in metric spaces

被引:4
|
作者
Fredriksson, Kimmo [1 ]
Braithwaite, Billy [1 ]
机构
[1] Univ Eastern Finland, Sch Comp, Kuopio 70211, Finland
关键词
Metric spaces; Similarity joins; k-nearest neighbors; Range searching; QUERIES; SEARCH; INDEX;
D O I
10.1016/j.is.2014.09.006
中图分类号
TP [自动化技术、计算机技术];
学科分类号
0812 ;
摘要
We consider the join operation in metric spaces. Given two sets A and B of objects drawn from some universe U, we want to compute the set A x B = {(a, b) is an element of A (sic) B vertical bar d(a, b) <= r) efficiently, where d: U x U -> R+ is a metric distance function and r is an element of R+ is user supplied query radius. In particular we are interested in the case where we have no index available (nor we can afford to build it) for either A or B. In this paper we improve the Quickjoin algorithm [6], inspired by the well-known Quicksort algorithm, by (i) replacing the low level component that handles small subsets with essentially brute-force nested loop with a more efficient method; (ii) showing that, contrary to Quicksort, in Quickjoin unbalanced partitioning can improve the algorithm; (iii) making the algorithm probabilistic while still obtaining most of the relevant results; and (iv) improving the (worst case) work space needed, from O(n(2)) to O(logn), where n is the size of the dataset(s). We also discuss pivot selection, show how to adapt Quickjoin to compute k-nearest neighbor joins and sketch a parallel version of the algorithm. The experimental results show that the method works well in practice. (C) 2014 Elsevier Ltd. All rights reserved.
引用
收藏
页码:189 / 204
页数:16
相关论文
共 50 条
  • [21] Pivot-based approximate k-NN similarity joins for big high-dimensional data
    Cech, Premysl
    Lokoc, Jakub
    Silva, Yasin N.
    INFORMATION SYSTEMS, 2020, 87
  • [22] Secure and Efficient k-NN Queries
    Asif, Hafiz
    Vaidya, Jaideep
    Shafiq, Basit
    Adam, Nabil
    ICT SYSTEMS SECURITY AND PRIVACY PROTECTION, SEC 2017, 2017, 502 : 155 - 170
  • [23] Evidential k-NN for Link Prediction
    Mallek, Sabrine
    Boukhris, Imen
    Elouedi, Zied
    Lefevre, Eric
    SYMBOLIC AND QUANTITATIVE APPROACHES TO REASONING WITH UNCERTAINTY, ECSQARU 2017, 2017, 10369 : 201 - 211
  • [24] Anonymizing k-NN Classification on MapReduce
    Bazai, Sibghat Ullah
    Jang-Jaccard, Julian
    Wang, Ruili
    MOBILE NETWORKS AND MANAGEMENT (MONAMI 2017), 2018, 235 : 364 - 377
  • [25] Learned k-NN Distance Estimation
    Amagata, Daichi
    Arai, Yusuke
    Fujita, Sumio
    Hara, Takahiro
    30TH ACM SIGSPATIAL INTERNATIONAL CONFERENCE ON ADVANCES IN GEOGRAPHIC INFORMATION SYSTEMS, ACM SIGSPATIAL GIS 2022, 2022, : 1 - 4
  • [26] Symmetrized Multivariate k-NN Estimators
    Fan, Yanqin
    Liu, Ruixuan
    ECONOMETRIC REVIEWS, 2015, 34 (6-10) : 827 - 847
  • [27] A FUZZY VERSION OF THE K-NN METHOD
    KISSIOV, VT
    HADJITODOROV, ST
    FUZZY SETS AND SYSTEMS, 1992, 49 (03) : 323 - 329
  • [28] Deep k-NN for Noisy Labels
    Bahri, Dara
    Jiang, Heinrich
    Gupta, Maya
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 119, 2020, 119
  • [29] An Analysis of Order Dependence in k-NN
    McSherry, David
    Stretch, Christopher
    ARTIFICIAL INTELLIGENCE AND COGNITIVE SCIENCE, 2010, 6206 : 207 - 218
  • [30] Efficient K-NN for Playlist Continuation
    Kelen, Domokos M.
    Berecz, Daniel
    Beres, Ferenc
    Benczur, Andras A.
    RECSYS CHALLENGE'18: PROCEEDINGS OF THE ACM RECOMMENDER SYSTEMS CHALLENGE 2018, 2018,