Real-Valued Embeddings and Sketches for Fast Distance and Similarity Estimation

Cited by: 8
Author
Rachkovskij, D. A. [1, 2]
Affiliations
[1] NAS, Int Sci Educ Ctr Informat Technol & Syst, Kiev, Ukraine
[2] MON Ukraine, Int Sci Educ Ctr Informat Technol & Syst, Kiev, Ukraine
Keywords
distance; similarity; embedding; sketch; dimensionality reduction; random projection; sampling; Johnson-Lindenstrauss lemma; kernel similarity; similarity search
DOI
10.1007/s10559-016-9899-x
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
This survey article considers methods and algorithms for fast estimation of distance and similarity measures between data objects using low-dimensional real-valued vectors formed from them. The methods do not use learning and rely mainly on random projection and sampling. The input data are mainly high-dimensional vectors with various distance measures (Euclidean, Manhattan, statistical, etc.) and similarity measures (dot product, etc.). Vector representations of non-vector data are also considered. The resulting vectors can also be used in similarity search algorithms, machine learning, etc.
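As an illustration of the random projection idea named in the abstract, the following is a minimal sketch and is not taken from the article. It assumes a dense Gaussian projection matrix with i.i.d. N(0, 1/k) entries, one common construction associated with the Johnson-Lindenstrauss lemma, and compares the Euclidean distance between two high-dimensional vectors with the distance between their low-dimensional embeddings. The helper names (`make_projection`, `embed`, `euclidean`) and the dimensions `d` and `k` are hypothetical choices made for this example.

```python
import math
import random


def make_projection(d, k, seed=0):
    """Hypothetical helper: k x d matrix with i.i.d. N(0, 1/k) entries."""
    rng = random.Random(seed)
    scale = 1.0 / math.sqrt(k)
    return [[rng.gauss(0.0, 1.0) * scale for _ in range(d)] for _ in range(k)]


def embed(matrix, x):
    """Project vector x (length d) to length k by a matrix-vector product."""
    return [sum(r * v for r, v in zip(row, x)) for row in matrix]


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


# Assumed sizes for the illustration: original dimension d, embedding dimension k.
d, k = 1000, 200

data_rng = random.Random(1)
x = [data_rng.uniform(-1.0, 1.0) for _ in range(d)]
y = [data_rng.uniform(-1.0, 1.0) for _ in range(d)]

R = make_projection(d, k)

true_dist = euclidean(x, y)                      # distance in the original space
est_dist = euclidean(embed(R, x), embed(R, y))   # distance between embeddings
rel_error = abs(est_dist - true_dist) / true_dist

print(f"true distance:      {true_dist:.3f}")
print(f"estimated distance: {est_dist:.3f}")
print(f"relative error:     {rel_error:.3f}")
```

With this scaling the squared embedded distance is an unbiased estimate of the squared original distance, and the estimate becomes more concentrated as `k` grows. The projection matrix is generated without any training, which matches the abstract's remark that the methods do not use learning.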
Pages: 967-988
Number of pages: 22