Quicker Similarity Joins in Metric Spaces

被引:0
|
作者
Fredriksson, Kimmo [1 ]
Braithwaite, Billy [1 ]
机构
[1] Univ Eastern Finland, Sch Comp, Kuopio, Finland
来源
SIMILARITY SEARCH AND APPLICATIONS (SISAP) | 2013年 / 8199卷
关键词
RANGE QUERIES; SEARCH; INDEX;
D O I
暂无
中图分类号
TP [自动化技术、计算机技术];
学科分类号
0812 ;
摘要
We consider the join operation in metric spaces. Given two sets A and B of objects drawn from some universe U, we want to compute the set A (sic) B = {(a, b) is an element of A x B vertical bar d(a, b) <= r} efficiently, where d : U x U -> R+ is a metric distance function and r is an element of R+ is user supplied query radius. In particular we are interested in the case where we have no index available (nor we can afford to build it) for either A or B. In this paper we improve the Quickjoin algorithm (Jacox and Samet, 2008), based on the well-know Quicksort algorithm, by (i) replacing the low level component that handles small subsets with essentially brute-force nested loop with a more efficient method; (ii) showing that, contrary to Quicksort, in Quickjoin unbalanced partitioning can improve the algorithm; and (iii) making the algorithm probabilistic while still obtaining most of the relevant results. We also show how to use Quickjoin to compute k-nearest neighbor joins. The experimental results show that the method works well in practice.
引用
收藏
页码:127 / 140
页数:14
相关论文
共 50 条
  • [1] Exploiting Database Similarity Joins for Metric Spaces
    Silva, Yasin N.
    Pearson, Spencer
    PROCEEDINGS OF THE VLDB ENDOWMENT, 2012, 5 (12): : 1922 - 1925
  • [2] Quicker range- and k-NN joins in metric spaces
    Fredriksson, Kimmo
    Braithwaite, Billy
    INFORMATION SYSTEMS, 2015, 52 : 189 - 204
  • [3] An efficient algorithm for approximated self-similarity joins in metric spaces
    Ferrada, Sebastian
    Bustos, Benjamin
    Reyes, Nora
    INFORMATION SYSTEMS, 2020, 91
  • [4] List of twin clusters: a data structure for similarity joins in metric spaces
    Paredes, Rodrigo
    Reyes, Nora
    2008 IEEE 24TH INTERNATIONAL CONFERENCE ON DATA ENGINEERING WORKSHOP, VOLS 1 AND 2, 2008, : 578 - +
  • [5] List of twin clusters: a data structure for similarity joins in metric spaces
    Paredes, Rodrigo
    Reyes, Nora
    SISAP 2008: FIRST INTERNATIONAL WORKSHOP ON SIMILARITY SEARCH AND APPLICATIONS, PROCEEDINGS, 2008, : 131 - 138
  • [6] Metric space similarity joins
    Jacox, Edwin H.
    Samet, Hanan
    ACM TRANSACTIONS ON DATABASE SYSTEMS, 2008, 33 (02):
  • [7] Solving similarity joins and range queries in metric spaces with the list of twin clusters
    Paredes, Rodrigo
    Reyes, Nora
    JOURNAL OF DISCRETE ALGORITHMS, 2009, 7 (01) : 18 - 35
  • [8] Flows and joins of metric spaces
    Mineyev, I
    GEOMETRY & TOPOLOGY, 2005, 9 : 403 - 482
  • [9] Metric Similarity Joins Using MapReduce
    Chen, Gang
    Yang, Keyu
    Chen, Lu
    Gao, Yunjun
    Zheng, Baihua
    Chen, Chun
    IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2017, 29 (03) : 656 - 669
  • [10] Efficient Metric Indexing for Similarity Search and Similarity Joins
    Chen, Lu
    Gao, Yunjun
    Li, Xinhan
    Jensen, Christian S.
    Chen, Gang
    IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2017, 29 (03) : 556 - 571