Efficient k-closest pair queries in general metric spaces

被引:0
|
作者
Gao, Yunjun [1 ,2 ]
Chen, Lu [1 ]
Li, Xinhan [1 ]
Yao, Bin [3 ]
Chen, Gang [1 ]
机构
[1] Zhejiang Univ, Coll Comp Sci, Hangzhou 310003, Zhejiang, Peoples R China
[2] Zhejiang Univ, Innovat Joint Res Ctr Cyber Phys Soc Syst, Hangzhou 310003, Zhejiang, Peoples R China
[3] Shanghai Jiao Tong Univ, Dept Comp Sci & Engn, Shanghai 200030, Peoples R China
来源
VLDB JOURNAL | 2015年 / 24卷 / 03期
关键词
k-Closest pair query; Metric space; Query processing; Cost model; Algorithm; NEAREST-NEIGHBOR SEARCH; COST MODEL; DISTANCE; ALGORITHMS; JOINS;
D O I
10.1007/s00778-015-0383-4
中图分类号
TP3 [计算技术、计算机技术];
学科分类号
0812 ;
摘要
Given two object sets and , a k-closest pair query finds closest object pairs from . This operation is common in many real-life applications such as GIS, data mining, and recommender systems. Although it has received much attention in the Euclidean space, there is little prior work on the metric space. In this paper, we study the problem of kCP query processing in general metric spaces, namely Metric kCP search, and propose several efficient algorithms using dynamic disk-based metric indexes (e.g., M-tree), which can be applied to arbitrary type of data as long as a certain metric distance is defined and satisfies the triangle inequality. Our approaches follow depth-first and/or best-first traversal paradigm(s), employ effective pruning rules based on metric space properties and the counting information preserved in the metric index, take advantage of aggressive pruning and compensation to further boost query efficiency, and derive a node-based cost model for retrieval. In addition, we extend our techniques to tackle two interesting variants of queries. Extensive experiments with both real and synthetic data sets demonstrate the performance of our proposed algorithms, the effectiveness of our developed pruning rules, and the accuracy of our presented cost model.
引用
收藏
页码:415 / 439
页数:25
相关论文
共 50 条
  • [1] Efficient k\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$k$$\end{document}-closest pair queries in general metric spaces
    Yunjun Gao
    Lu Chen
    Xinhan Li
    Bin Yao
    Gang Chen
    The VLDB Journal, 2015, 24 (3) : 415 - 439
  • [2] Distributed k-Nearest Neighbor Queries in Metric Spaces
    Ding, Xin
    Zhang, Yuanliang
    Chen, Lu
    Gao, Yunjun
    Zheng, Baihua
    WEB AND BIG DATA (APWEB-WAIM 2018), PT I, 2018, 10987 : 236 - 252
  • [3] Aggregate k Nearest Neighbor Queries in Metric Spaces
    Ding, Xin
    Zhang, Yuanliang
    Chen, Lu
    Yang, Keyu
    Gao, Yunjun
    WEB AND BIG DATA (APWEB-WAIM 2018), PT II, 2018, 10988 : 317 - 333
  • [4] Algorithms for processing K-closest-pair queries in spatial databases
    Corral, A
    Manolopoulos, Y
    Theodoridis, Y
    Vassilakopoulos, M
    DATA & KNOWLEDGE ENGINEERING, 2004, 49 (01) : 67 - 104
  • [5] Distributed Similarity Queries in Metric Spaces
    Keyu Yang
    Xin Ding
    Yuanliang Zhang
    Lu Chen
    Baihua Zheng
    Yunjun Gao
    Data Science and Engineering, 2019, 4 : 93 - 108
  • [6] Distributed Similarity Queries in Metric Spaces
    Yang, Keyu
    Ding, Xin
    Zhang, Yuanliang
    Chen, Lu
    Zheng, Baihua
    Gao, Yunjun
    DATA SCIENCE AND ENGINEERING, 2019, 4 (02) : 93 - 108
  • [7] Processing Top-k Dominating Queries in Metric Spaces
    Tiakas, Eleftherios
    Valkanas, George
    Papadopoulos, Apostolos N.
    Manolopoulos, Yannis
    Gunopulos, Dimitrios
    ACM TRANSACTIONS ON DATABASE SYSTEMS, 2015, 40 (04):
  • [8] On efficient k-optimal-location-selection query processing in metric spaces
    Gao, Yunjun
    Qi, Shuyao
    Chen, Lu
    Zheng, Baihua
    Li, Xinhan
    INFORMATION SCIENCES, 2015, 298 : 98 - 117
  • [9] On Closest Pair in Euclidean Metric: Monochromatic is as Hard as Bichromatic
    Karthik, C. S.
    Manurangsi, Pasin
    COMBINATORICA, 2020, 40 (04) : 539 - 573
  • [10] An Effective Cost Model for Similarity Queries in Metric Spaces
    Baioco, Gisele Busichia
    Traina, Agma J. M.
    Traina, Caetano, Jr.
    APPLIED COMPUTING 2007, VOL 1 AND 2, 2007, : 527 - 528