Efficient Similarity Search in Scientific Databases with Feature Signatures

Cited by: 2
Authors
Uysal, Merih Seran [1]
Beecks, Christian [1]
Schmuecking, Jochen [1]
Seidl, Thomas [1]
Affiliations
[1] Rhein Westfal TH Aachen, Data Management & Explorat Grp, Aachen, Germany
Source
PROCEEDINGS OF THE 27TH INTERNATIONAL CONFERENCE ON SCIENTIFIC AND STATISTICAL DATABASE MANAGEMENT | 2015
Keywords
Scientific Databases; Feature Signatures; Earth Mover's Distance; Lower Bound
DOI
10.1145/2791347.2791384
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
The rapid growth of scientific data calls for efficient similarity search techniques, for which suitable object representation models are vital. Feature signatures, a highly flexible form of object feature representation, have gained increasing attention, and corresponding efficiency improvement techniques are being developed. In this paper, we focus on efficient query processing with the well-known Earth Mover's Distance (EMD) on databases of feature signatures. We propose efficient approximation techniques that apply to high-dimensional feature signatures via dimensionality reduction and guarantee completeness (no false dismissals) within a filter-and-refine architecture. Rigorous experiments on real-world data indicate a considerable reduction in the number of EMD computations and high efficiency of the proposed techniques, which significantly reduce query processing time.
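The filter-and-refine idea named in the abstract can be illustrated with a minimal sketch. It is not the paper's method: it assumes one-dimensional signatures with total weight 1, uses the classic centroid distance as the lower bound instead of the proposed dimensionality-reduction techniques, and all names (`emd_1d`, `centroid_lb`, `range_query`, the toy database) are hypothetical.

```python
# Filter-and-refine range query with a lower bound on the EMD.
# Assumptions (not from the paper): 1-D signatures given as lists of
# (position, weight) pairs with total weight 1, ground distance |x - y|,
# and the centroid distance as the lower-bounding filter.

def emd_1d(p, q):
    """Exact EMD in 1-D: area between the two cumulative weight functions."""
    events = sorted([(x, w) for x, w in p] + [(x, -w) for x, w in q])
    total, cdf_diff, prev = 0.0, 0.0, events[0][0]
    for x, w in events:
        total += abs(cdf_diff) * (x - prev)  # area accumulated since prev
        cdf_diff += w
        prev = x
    return total

def centroid_lb(p, q):
    """Distance between weighted means; never exceeds the EMD for equal mass."""
    mean_p = sum(x * w for x, w in p)
    mean_q = sum(x * w for x, w in q)
    return abs(mean_p - mean_q)

def range_query(db, query, eps):
    """Return keys with EMD <= eps and the number of exact EMD computations."""
    results, refinements = [], 0
    for key, sig in db.items():
        if centroid_lb(sig, query) > eps:
            continue  # filter step: safe to discard, so no false dismissals
        refinements += 1  # refine step: candidate needs the exact distance
        if emd_1d(sig, query) <= eps:
            results.append(key)
    return results, refinements

db = {
    "a": [(0.0, 0.5), (2.0, 0.5)],
    "b": [(10.0, 1.0)],
    "c": [(1.0, 1.0)],
    "d": [(-2.0, 0.5), (4.5, 0.5)],
}
query = [(1.0, 0.5), (1.5, 0.5)]

print(range_query(db, query, 1.0))  # (['a', 'c'], 3)
```

Because the filter value never exceeds the exact distance, discarding an object whose lower bound is above the threshold cannot lose a true result. In this toy run, "b" is pruned without an exact computation, while "d" passes the filter and is rejected only in the refinement step.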
Pages: 12
Related papers
50 in total
  • [1] Geometric Graph Indexing for Similarity Search in Scientific Databases
    Armiti, Ayser
    Gertz, Michael
    28TH INTERNATIONAL CONFERENCE ON SCIENTIFIC AND STATISTICAL DATABASE MANAGEMENT (SSDBM 2016), 2016
  • [2] Efficient similarity search for hierarchical data in large databases
    Kailing, K
    Kriegel, HP
    Schönauer, S
    Seidl, T
    ADVANCES IN DATABASE TECHNOLOGY - EDBT 2004, PROCEEDINGS, 2004, 2992 : 676 - 693
  • [3] Efficient similarity search on multidimensional space of biometric databases
    Jayaraman, Umarani
    Gupta, Phalguni
    NEUROCOMPUTING, 2021, 452 : 623 - 652
  • [4] ISIS: A New Approach for Efficient Similarity Search in Sparse Databases
    Cui, Bin
    Zhao, Jiakui
    Cong, Gao
    DATABASE SYSTEMS FOR ADVANCED APPLICATIONS, PT II, PROCEEDINGS, 2010, 5982 : 231 - +
  • [5] An efficient similarity search based on indexing in large DNA databases
    Jeong, In-Seon
    Park, Kyoung-Wook
    Kang, Seung-Ho
    Lim, Hyeong-Seok
    COMPUTATIONAL BIOLOGY AND CHEMISTRY, 2010, 34 (02) : 131 - 136
  • [6] Efficient Graph Similarity Search Over Large Graph Databases
    Zheng, Weiguo
    Zou, Lei
    Lian, Xiang
    Wang, Dong
    Zhao, Dongyan
    IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2015, 27 (04) : 964 - 978
  • [7] Anticipatory DTW for Efficient Similarity Search in Time Series Databases
    Assent, Ira
    Wichterich, Marc
    Krieger, Ralph
    Kremer, Hardy
    Seidl, Thomas
    PROCEEDINGS OF THE VLDB ENDOWMENT, 2009, 2 (01):
  • [8] Efficient Subgraph Similarity Search on Large Probabilistic Graph Databases
    Yuan, Ye
    Wang, Guoren
    Chen, Lei
    Wang, Haixun
    PROCEEDINGS OF THE VLDB ENDOWMENT, 2012, 5 (09) : 800 - 811
  • [9] Efficient similarity search in large databases of tree structured objects
    Kailing, K
    Kriegel, HP
    Schönauer, S
    Seidl, T
    20TH INTERNATIONAL CONFERENCE ON DATA ENGINEERING, PROCEEDINGS, 2004 : 835 - 835
  • [10] Piers: An efficient model for similarity search in DNA sequence databases
    Cao, X
    Li, SC
    Ooi, BC
    Tung, AKH
    SIGMOD RECORD, 2004, 33 (02) : 39 - 44