IDPriU: A two-party ID-private data union protocol for privacy-preserving machine learning

Cited by: 0
Authors
Yan, Jianping [1]
Wei, Lifei [1]
Qian, Xiansong [2]
Zhang, Lei [2]
Affiliations
[1] Shanghai Maritime Univ, Coll Informat Engn, Shanghai 201306, Peoples R China
[2] Shanghai Ocean Univ, Coll Informat Technol, Shanghai 201306, Peoples R China
Funding
Natural Science Foundation of Shanghai
Keywords
Private data union; Privacy-preserving machine learning; Data security; Data preprocessing; Private set union
DOI
10.1016/j.jisa.2024.103913
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Due to significant data security concerns in machine learning, such as the data silo problem, there has been a growing trend towards the development of privacy-preserving machine learning applications. The initial step in training on data across silos is to establish secure data joins, specifically private data joins, to ensure the consistency and accuracy of the dataset. While the majority of current research focuses on the inner join of private data, this paper addresses the privacy-preserving full join of private data and develops two-party unbalanced private data full join protocols using secure multi-party computation tools. Notably, our paper introduces a novel component, Private Match-and-Connect (PMC), which performs a union operation on the IDs and feature values and ensures that the resulting union set is secret-shared. Each participant receives only one share of the result, which protects the data during the pre-processing phase. Furthermore, we propose the two-party ID-private data union protocol (IDPriU), which facilitates secure and accurate matching of feature value shares and ID shares and also enables data alignment. Our protocol represents a significant advancement in privacy-preserving data preprocessing for machine learning and in privacy-preserving federated queries. It moves beyond the notion that private data joins are limited to inner joins, offering a novel approach based on Private Set Union (PSU). We implemented our protocol, and experiments show favorable results in terms of both runtime and communication overhead.
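To make the target of the abstract concrete, the following is a minimal sketch of the input/output behavior of a secret-shared full join, not the PMC or IDPriU protocol itself. It computes the join in the clear, as a trusted party would, and only illustrates what each participant ends up holding. The ring size, the default value for missing features, and all function names are assumptions introduced for illustration; none of them come from the paper.

```python
import secrets

MODULUS = 2 ** 32  # assumed ring Z_{2^32} for additive shares (not from the paper)


def share(value):
    """Split an integer into two additive shares modulo MODULUS."""
    r = secrets.randbelow(MODULUS)
    return r, (value - r) % MODULUS


def reconstruct(s0, s1):
    """Recombine two additive shares."""
    return (s0 + s1) % MODULUS


def full_join_shared(table_a, table_b, missing=0):
    """Full join of two {id: feature} tables, returned as two share tables.

    Hypothetical helper: the join is computed in the clear here, so this
    shows only the output format, not a secure computation.
    """
    shares_a, shares_b = [], []
    # Sorting is only for a deterministic row order in this sketch.
    for key in sorted(set(table_a) | set(table_b)):
        row = (key, table_a.get(key, missing), table_b.get(key, missing))
        pairs = [share(v) for v in row]
        shares_a.append(tuple(p[0] for p in pairs))
        shares_b.append(tuple(p[1] for p in pairs))
    return shares_a, shares_b


def open_rows(shares_a, shares_b):
    """Reconstruct every (id, feature_a, feature_b) row from both share tables."""
    return [
        tuple(reconstruct(x, y) for x, y in zip(row_a, row_b))
        for row_a, row_b in zip(shares_a, shares_b)
    ]


# Toy inputs: party A and party B overlap only on ID 102.
party_a = {101: 7, 102: 9}
party_b = {102: 4, 103: 5}
out_a, out_b = full_join_shared(party_a, party_b)
joined = open_rows(out_a, out_b)
```

Each party's share table on its own consists of uniformly random ring elements, so neither party learns the other's IDs or feature values from its output alone. Only combining both tables reveals the three joined rows, including the IDs held by just one party.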
Pages: 13