Index Structures for Fast Similarity Search for Symbol Strings

被引:0
作者
D. A. Rachkovskij
机构
[1] NAS of Ukraine and MES of Ukraine,International Research and Training Center for Information Technologies and Systems
来源
Cybernetics and Systems Analysis | 2019年 / 55卷
关键词
similarity search; edit distance; nearest neighbor; near neighbor; index structure; inverted indexing; n-gram; locality-sensitive hashing; treelike structure;
D O I
暂无
中图分类号
学科分类号
摘要
This article surveys index structures for fast similarity search for objects represented by symbol strings. Index structures both for exact and approximate searches by edit distance are considered. Index structures based on inverted indexing, similarity preserving hashing, and treelike structures are mainly presented. Ideas of well-known and recently proposed algorithms are described.
引用
收藏
页码:860 / 878
页数:18
相关论文
共 104 条
[1]  
Rachkovskij DA(2016)Real-valued vectors for fast distance and similarity estimation Cybernetics and Systems Analysis 52 967-988
[2]  
Rachkovskij DA(2017)Binary vectors for fast distance and similarity estimation Cybernetics and Systems Analysis 53 138-156
[3]  
Rachkovskij DA(2017)Distance-based index structures for fast similarity search Cybernetics and Systems Analysis 53 636-658
[4]  
Rachkovskij DA(2017)Index structures for fast similarity search for binary vectors Cybernetics and Systems Analysis 53 799-820
[5]  
Rachkovskij DA(2018)Index structures for fast similarity search for real-valued vectors. I Cybernetics and Systems Analysis 54 152-164
[6]  
Rachkovskij DA(2018)Index structures for fast similarity search for real-valued vectors. II Cybernetics and Systems Analysis 54 320-335
[7]  
Boytsov L(2011)Indexing methods for approximate dictionary searching: Comparative analysis J. Exp. Algorithmics 16 1.1:1-1.1:91
[8]  
Jiang Y(2014)String similarity joins: An experimental evaluation Proc. VLDB Endowment 7 625-636
[9]  
Li G(2016)String similarity search and join: A survey Frontiers of Computer Science 10 399-417
[10]  
Feng J(2008)Near-optimal hashing algorithms for approximate nearest neighbor in high dimensions Comm. ACM 51 117-122