Greedy Filtering: A Scalable Algorithm for K-Nearest Neighbor Graph Construction

被引:0
|
作者
Park, Youngki [1 ]
Park, Sungchan [1 ]
Lee, Sang-goo [1 ]
Jung, Woosung [2 ]
机构
[1] Seoul Natl Univ, Sch Comp Sci & Engn, Seoul 151, South Korea
[2] Chungbuk Natl Univ, Dept Comp Engn, Jeonju, South Korea
来源
DATABASE SYSTEMS FOR ADVANCED APPLICATIONS, DASFAA 2014, PT I | 2014年 / 8421卷
基金
新加坡国家研究基金会;
关键词
k-nearest neighbor graph; similarity join; similarity search; SIMILARITY JOINS;
D O I
暂无
中图分类号
TP [自动化技术、计算机技术];
学科分类号
0812 ;
摘要
Finding the k-nearest neighbors for every node is one of the most important data mining tasks as a primitive operation in the field of Information Retrieval and Recommender Systems. However, existing approaches to this problem do not perform as well when the number of nodes or dimensions is scaled up. In this paper, we present greedy filtering, an efficient and scalable algorithm for finding an approximate k-nearest neighbor graph by filtering node pairs whose large value dimensions do not match at all. In order to avoid skewness in the results and guarantee a time complexity of O(n), our algorithm chooses essentially a fixed number of node pairs as candidates for every node. We also present a faster version of greedy filtering based on the use of inverted indices for the node prefixes. We conduct extensive experiments in which we (i) compare our approaches to the state-of-the-art algorithms in seven different types of datasets, and (ii) adopt other algorithms in related fields (similarity join, top-k similarity join and similarity search fields) to solve this problem and evaluate them. The experimental results show that greedy filtering guarantees a high level of accuracy while also being much faster than other algorithms for large amounts of high-dimensional data.
引用
收藏
页码:327 / 341
页数:15
相关论文
共 50 条
  • [1] Scalable K-Nearest Neighbor Graph Construction Based on Greedy Filtering
    Park, Youngki
    Park, Sungchan
    Lee, Sang-goo
    Jung, Woosung
    PROCEEDINGS OF THE 22ND INTERNATIONAL CONFERENCE ON WORLD WIDE WEB (WWW'13 COMPANION), 2013, : 227 - 228
  • [2] A novel algorithm for scalable k-nearest neighbour graph construction
    Park, Youngki
    Hwang, Heasoo
    Lee, Sang-goo
    JOURNAL OF INFORMATION SCIENCE, 2016, 42 (02) : 274 - 288
  • [3] Fast Collaborative Filtering with a k-Nearest Neighbor Graph
    Park, Youngki
    Park, Sungchan
    Lee, Sang-goo
    Jung, Woosung
    2014 INTERNATIONAL CONFERENCE ON BIG DATA AND SMART COMPUTING (BIGCOMP), 2014, : 92 - +
  • [4] Fast Algorithm for Approximate k-Nearest Neighbor Graph Construction3
    Wang, Dilin
    Shi, Lei
    Cao, Jianwen
    2013 IEEE 13TH INTERNATIONAL CONFERENCE ON DATA MINING WORKSHOPS (ICDMW), 2013, : 349 - 356
  • [5] Optimization of the Neighbor Parameter of k-Nearest Neighbor Algorithm for Collaborative Filtering
    Vaghela, Vimalkumar B.
    Pathak, Himalay H.
    PROCEEDINGS OF INTERNATIONAL CONFERENCE ON COMMUNICATION AND NETWORKS, 2017, 508 : 87 - 93
  • [6] Reversed CF: A fast collaborative filtering algorithm using a k-nearest neighbor graph
    Park, Youngki
    Park, Sungchan
    Jung, Woosung
    Lee, Sang-goo
    EXPERT SYSTEMS WITH APPLICATIONS, 2015, 42 (08) : 4022 - 4028
  • [7] A Scalable K-Nearest Neighbor Algorithm for Recommendation System Problems
    Sagdic, A.
    Tekinbas, C.
    Arslan, E.
    Kucukyilmaz, T.
    2020 43RD INTERNATIONAL CONVENTION ON INFORMATION, COMMUNICATION AND ELECTRONIC TECHNOLOGY (MIPRO 2020), 2020, : 186 - 191
  • [8] Fast Parallel Cosine K-Nearest Neighbor Graph Construction
    Anastasiu, David C.
    Karypis, George
    PROCEEDINGS OF 2016 6TH WORKSHOP ON IRREGULAR APPLICATIONS: ARCHITECTURE AND ALGORITHMS (IA3), 2016, : 50 - 53
  • [9] Improvement of k-nearest neighbor algorithm based on double filtering
    Ma, Chun Jie
    Ding, Zheng Sheng
    2020 5TH INTERNATIONAL CONFERENCE ON MECHANICAL, CONTROL AND COMPUTER ENGINEERING (ICMCCE 2020), 2020, : 1567 - 1570
  • [10] Quantum K-nearest neighbor algorithm
    Chen, Hanwu
    Gao, Yue
    Zhang, Jun
    Dongnan Daxue Xuebao (Ziran Kexue Ban)/Journal of Southeast University (Natural Science Edition), 2015, 45 (04): : 647 - 651