Real-Valued Embeddings and Sketches for Fast Distance and Similarity Estimation

Cited by: 8
Author
Rachkovskij, D. A. [1, 2]
Affiliations
[1] NAS, Int Sci Educ Ctr Informat Technol & Syst, Kiev, Ukraine
[2] MON Ukraine, Int Sci Educ Ctr Informat Technol & Syst, Kiev, Ukraine
Keywords
distance; similarity; embedding; sketch; dimensionality reduction; random projection; sampling; Johnson-Lindenstrauss lemma; kernel similarity; similarity search
DOI
10.1007/s10559-016-9899-x
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
This survey article considers methods and algorithms for fast estimation of distance and similarity measures between data objects, using low-dimensional real-valued vectors formed from those objects. The methods do not use learning and rely mainly on random projection and sampling. The initial data are mainly high-dimensional vectors with various measures of distance (Euclidean, Manhattan, statistical, etc.) and similarity (dot product, etc.). Vector representations of non-vector data are also considered. The resulting vectors can also be used in similarity search algorithms, machine learning, etc.
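As an illustration of the random projection idea named in the abstract, here is a minimal sketch of a Gaussian random projection in the spirit of the Johnson-Lindenstrauss lemma. It is not taken from the survey. The function names (`make_projection`, `embed`, `euclidean`), the dimensions, and the test data are all assumptions chosen for the example. It shows that the Euclidean distance between two short embedded vectors approximates the distance between the original high-dimensional vectors.

```python
import math
import random


def make_projection(d, k, seed=0):
    """Return a k x d matrix with i.i.d. N(0, 1/k) entries (hypothetical helper)."""
    rng = random.Random(seed)
    scale = 1.0 / math.sqrt(k)
    return [[rng.gauss(0.0, 1.0) * scale for _ in range(d)] for _ in range(k)]


def embed(R, x):
    """Map a d-dimensional vector x to k dimensions by computing R @ x."""
    return [sum(r_i * x_i for r_i, x_i in zip(row, x)) for row in R]


def euclidean(a, b):
    """Exact Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


# Assumed sizes: original dimension d, embedding dimension k << d.
d, k = 1000, 200
data_rng = random.Random(1)
x = [data_rng.uniform(-1.0, 1.0) for _ in range(d)]
y = [data_rng.uniform(-1.0, 1.0) for _ in range(d)]

# The same matrix R must be used for every vector being compared.
R = make_projection(d, k)

exact = euclidean(x, y)
estimate = euclidean(embed(R, x), embed(R, y))
relative_error = abs(estimate - exact) / exact
print(f"exact={exact:.3f} estimate={estimate:.3f} rel_err={relative_error:.3f}")
```

The 1/sqrt(k) scaling makes the squared distance between embeddings an unbiased estimate of the original squared distance. A larger `k` reduces the variance of the estimate but makes the embeddings longer and slower to compare.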
Pages: 967-988
Number of pages: 22