Indexing Earth Mover's Distance over Network Metrics

被引:3
|
作者
Wang, Ting [1 ]
Meng, Shicong [2 ]
Bian, Jiang [3 ]
机构
[1] IBM Corp, Thomas J Watson Res Ctr, Yorktown Hts, NY 10598 USA
[2] Facebook Inc, Menlo Pk, CA 94025 USA
[3] Microsoft Res Asia, Beijing 100080, Peoples R China
关键词
Earth mover's distance; network metrics; similarity search; delimit and filter;
D O I
10.1109/TKDE.2014.2373359
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
The Earth Mover's Distance (EMD) is a well-known distance metric for data represented as probability distributions over a predefined feature space. Supporting EMD-based similarity search has attracted intensive research effort. Despite the plethora of literature, most existing solutions are optimized for L-p feature spaces (e.g., Euclidean space); while in a spectrum of applications, the relationships between features are better captured using networks. In this paper, we study the problem of answering k-nearest neighbor (k-NN) queries under network-based EMD metrics (NEMD). We propose OASIS, a new access method which leverages the network structure of feature space and enables efficient NEMD-based similarity search. Specifically, OASIS employs three novel techniques: (i) Range Oracle, a scalable model to estimate the range of k-th nearest neighbor under NEMD, (ii) Boundary Index, a structure that efficiently fetches candidates within given range, and (iii) Network Compression Hierarchy, an incremental filtering mechanism that effectively prunes false positive candidates to save unnecessary computation. Through extensive experiments using both synthetic and real data sets, we confirmed that OASIS significantly outperforms the state-of-the-art methods in query processing cost.
引用
收藏
页码:1588 / 1601
页数:14
相关论文
共 50 条
  • [21] A New Measure of Congruence: The Earth Mover's Distance
    Lupu, Noam
    Selios, Lucia
    Warner, Zach
    POLITICAL ANALYSIS, 2017, 25 (01) : 95 - 113
  • [22] Sublinear Time Algorithms for Earth Mover’s Distance
    Khanh Do Ba
    Huy L. Nguyen
    Huy N. Nguyen
    Ronitt Rubinfeld
    Theory of Computing Systems, 2011, 48 : 428 - 442
  • [23] The Earth Mover's Distance as a Metric for Image Retrieval
    Yossi Rubner
    Carlo Tomasi
    Leonidas J. Guibas
    International Journal of Computer Vision, 2000, 40 : 99 - 121
  • [24] The Earth Mover's Distance as a metric for image retrieval
    Rubner, Y
    Tomasi, C
    Guibas, LJ
    INTERNATIONAL JOURNAL OF COMPUTER VISION, 2000, 40 (02) : 99 - 121
  • [25] EARTH-MOVER'S DISTANCE AS A TRACKING REGULARIZER
    Charles, Adam S.
    Bertrand, Nicholas P.
    Lee, John
    Rozell, Christopher J.
    2017 IEEE 7TH INTERNATIONAL WORKSHOP ON COMPUTATIONAL ADVANCES IN MULTI-SENSOR ADAPTIVE PROCESSING (CAMSAP), 2017,
  • [26] EARTH MOVER DISTANCE ON SUPERPIXELS
    Boltz, Sylvain
    Nielsen, Frank
    Soatto, Stefano
    2010 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING, 2010, : 4597 - 4600
  • [27] Efficient and effective similarity search over probabilistic data based on Earth Mover's Distance
    Xu, Jia
    Zhang, Zhenjie
    Tung, Anthony K. H.
    Yu, Ge
    VLDB JOURNAL, 2012, 21 (04): : 535 - 559
  • [28] Efficient and Effective Similarity Search over Probabilistic Data based on Earth Mover's Distance
    Xu, Jia
    Zhang, Zhenjie
    Tung, Anthony K. H.
    Yu, Ge
    PROCEEDINGS OF THE VLDB ENDOWMENT, 2010, 3 (01): : 758 - 769
  • [29] Efficient and effective similarity search over probabilistic data based on Earth Mover’s Distance
    Jia Xu
    Zhenjie Zhang
    Anthony K. H. Tung
    Ge Yu
    The VLDB Journal, 2012, 21 : 535 - 559
  • [30] Earth Mover's Distance Pooling over Siamese LSTMs for Automatic Short Answer Grading
    Kumar, Sachin
    Chakrabarti, Soumen
    Roy, Shourya
    PROCEEDINGS OF THE TWENTY-SIXTH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2017, : 2046 - 2052