An efficient secure k nearest neighbor classification protocol with high-dimensional features

被引:10
作者
Sun, Maohua [1 ]
Yang, Ruidi [1 ]
机构
[1] Capital Univ Econ & Business, Sch Management & Engn, Beijing 100070, Peoples R China
关键词
fully homomorphic scheme; honest-but-curious model; machine learning; oblivious transfer; privacy-preserving Euclidean distance; secure kNN; secure multi-party computation; MACHINE;
D O I
10.1002/int.22272
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
k Nearest neighbor (kNN) classification algorithm is a prediction model which is widely used for real-life applications, such as healthcare, finance, computer vision, personalization recommendation and precision marketing. The arrival of data explosion era results in the significant increase of feature dimension, which also makes for the increase of privacy concern over the available samples and unlabeled data in the applications of machine learning. In this paper, we present a secure low communication overhead kNN classification protocol that is able to deal with high-dimensional features given in real numbers. First, to deal with feature values given in real numbers, we develop a specific data conversion algorithm, which is used in the chosen fully homomorphic scheme. This conversion algorithm is generic and applicable to other algorithms that need to handle real numbers using the fully homomorphic scheme. Second, we present a privacy-preserving euclidean distance protocol (PPEDP), which works with the Euclidean distance computation between two points given in real numbers in a high-dimensional space. Then, based on the novelty PPEDP and oblivious transfer, we propose a new classification approach, efficient secure kNN classification protocol, (ESkNN) with low communication overhead, which is appropriate for a sample set with high-dimensional features and real number feature values. Moreover, we implement ESkNN in C++. Experimental results show that ESkNN is several orders of magnitude faster in performance than existing works, and scales up to 18 000 feature dimension in a memory limited environment.
引用
收藏
页码:1791 / 1813
页数:23
相关论文
共 38 条
  • [1] [Anonymous], 2016, PRIVLOGIT EFFICIENT
  • [2] [Anonymous], 2006, KDD, DOI DOI 10.1145/1150402.1150477
  • [3] [Anonymous], 2017, IACR Cryptol. ePrint Arch.
  • [4] [Anonymous], 2018, IACR Cryptol. ePrint Arch.
  • [5] [Anonymous], 2016, OBLIVIOUSTRANSFER OB
  • [6] Asharov Gilad, 2018, Proceedings on Privacy Enhancing Technologies, V2018, P104, DOI 10.1515/popets-2018-0034
  • [7] Barnett A., 2017, IACR CRYPTOL EPRINT, V2017, P857
  • [8] Brakerski Zvika, 2014, ACM Transactions on Computation Theory, V6, DOI 10.1145/2633600
  • [9] Chen H, 2019, SANNS SCALING SECURE
  • [10] Cock MD, 2020, HIGH PERFORMANCE LOG