Privacy-Preserving Collaborative Data Collection and Analysis With Many Missing Values

Cited by: 10
Authors
Sei, Yuichi [1, 2]
Onesimu, J. Andrew [3]
Okumura, Hiroshi [4]
Ohsuga, Akihiko [1]
Affiliations
[1] Univ Electrocommun, Tokyo 1828585, Japan
[2] PRESTO, JST, Kawaguchi, Saitama 3320012, Japan
[3] Manipal Acad Higher Educ, Manipal Inst Technol, Dept Comp Sci & Engn, Manipal 576104, India
[4] Mitsubishi Res Inst, Tokyo 1008141, Japan
Keywords
Data collection; Servers; Differential privacy; Data models; COVID-19; Privacy; Hospitals; differential privacy; missing values; multi-dimensional analysis; privacy-preserving data collection; MEMBERSHIP INFERENCE ATTACKS; VALUE IMPUTATION; COPULAS; NOISE
DOI
10.1109/TDSC.2022.3174887
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Privacy-preserving data mining techniques are useful for analyzing various information, such as Internet of Things data and COVID-19-related patient data. However, collecting a large amount of sensitive personal information is a challenging task. In addition, this information may have missing values, which are not considered in the existing methods for collecting personal information while ensuring data privacy. Failure to account for missing values reduces the accuracy of the data analysis. In this article, we propose a method for privacy-preserving data collection that considers many missing values. The patient data are anonymized and sent to a data collection server. The data collection server creates a generative model and a contingency table suitable for multi-attribute analysis based on expectation-maximization and Gaussian copula methods. Using differential privacy (the de facto standard) as a privacy metric, we conduct experiments on synthetic and real data, including COVID-19-related data. The results are 50-80% more accurate than those of existing methods that do not consider missing values.
Pages: 2158-2173
Page count: 16