Solving Truthfulness-Privacy Trade-Off in Mixed Data Outsourcing by Using Data Balancing and Attribute Correlation-Aware Differential Privacy

Cited by: 0
Authors
Majeed, Abdul [1]
Hwang, Seong Oun [1]
Affiliations
[1] Gachon Univ, Dept Comp Engn, Seongnam 13120, South Korea
Source
IEEE ACCESS | 2025 / Vol. 13
Funding
National Research Foundation, Singapore;
Keywords
Data privacy; Outsourcing; Data models; Privacy; Noise; Information integrity; Information filtering; Usability; Sensitivity; Correlation; Personal data; differential privacy; data truthfulness; attribute correlations; data balancing; K-ANONYMITY;
DOI
10.1109/ACCESS.2025.3537109
Chinese Library Classification number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812 ;
Abstract
In the modern era, data of diverse types (medical, financial, etc.) are outsourced from data owner environments to the public domains for data mining and knowledge discovery purposes. However, data often encompass sensitive information about individuals, and outsourcing the data without sufficient protection may endanger privacy. Anonymization methods are mostly used in data outsourcing to protect privacy; however, it is very hard to apply anonymity to datasets of poor quality while maintaining an equilibrium between privacy, utility, and truthfulness (i.e., ensuring the values in anonymized data are consistent with the real data). To address these technical problems, we propose and implement a data balancing and attribute correlation-aware differential privacy (DP) method for mixed data outsourcing while accomplishing the three crucial objectives of privacy, truthfulness, and utility. Our method first identifies quality-related issues in the data and solves them in an automated manner by adding the fewest possible good-quality synthetic records. We propose a data partitioning method that exploits correlations between attributes to create blocks of data to lessen the amount of noise added by the DP model. To preserve higher truthfulness while guaranteeing privacy, categorical attributes are considered as one unit, and an exponential mechanism is applied to them. The numerical attributes are transformed using the Laplace mechanism with a relatively higher ε. The joint application of these mechanisms to data blocks enables effective resolution of the truthfulness-privacy tradeoff, and data usability is extremely high. Extensive experiments are performed on three benchmark datasets to demonstrate the effectiveness of our method in real scenarios. The experiment results and analysis indicate significantly better performance on four different evaluation metrics compared to the recent state-of-the-art (SOTA) DP-based methods. Furthermore, our method has better efficiency than its counterparts.
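The abstract names two standard DP building blocks: the Laplace mechanism for numerical attributes and the exponential mechanism for categorical ones. The sketch below shows the textbook form of each mechanism only. It is not the authors' implementation and does not reproduce their data balancing, correlation-based partitioning, or treatment of categorical attributes as one unit. The function names, the frequency-based utility, the sensitivity values, and the sample data are all assumptions made for illustration.

```python
import math
import random


def laplace_mechanism(value, sensitivity, epsilon, rng=random):
    """Return `value` plus Laplace noise with scale sensitivity / epsilon."""
    if epsilon <= 0 or sensitivity <= 0:
        raise ValueError("epsilon and sensitivity must be positive")
    scale = sensitivity / epsilon
    # The difference of two i.i.d. exponential draws with mean `scale`
    # follows a Laplace(0, scale) distribution.
    noise = rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)
    return value + noise


def exponential_mechanism(candidates, utility, sensitivity, epsilon, rng=random):
    """Sample one candidate with probability proportional to
    exp(epsilon * utility(c) / (2 * sensitivity))."""
    if not candidates:
        raise ValueError("candidates must be non-empty")
    if epsilon <= 0 or sensitivity <= 0:
        raise ValueError("epsilon and sensitivity must be positive")
    scores = [utility(c) for c in candidates]
    top = max(scores)
    # Subtracting the maximum score keeps exp() from overflowing and
    # leaves the sampling probabilities unchanged.
    weights = [math.exp(epsilon * (s - top) / (2.0 * sensitivity)) for s in scores]
    return rng.choices(candidates, weights=weights, k=1)[0]


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical category counts used as the utility function.
    counts = {"Private": 120, "Self-employed": 30, "Government": 50}
    noisy_age = laplace_mechanism(37.0, sensitivity=1.0, epsilon=2.0, rng=rng)
    category = exponential_mechanism(
        list(counts), counts.get, sensitivity=1.0, epsilon=1.0, rng=rng
    )
    print(noisy_age, category)
```

A larger ε shrinks the Laplace noise scale, which is consistent with the abstract's choice of a relatively higher ε for numerical attributes when truthfulness is a priority.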
Pages: 23171-23194
Number of pages: 24
Related papers
15 in total
  • [1] Solving the Privacy-Equity Trade-off in Data Sharing By Using Homophily, Diversity, and t-Closeness Based Anonymity Algorithm
    Majeed, Abdul
    Hwang, Seong Oun
    IEEE ACCESS, 2024, 12 : 181953 - 181974
  • [2] Towards a data privacy-predictive performance trade-off
    Carvalho, Tania
    Moniz, Nuno
    Faria, Pedro
    Antunes, Luis
    EXPERT SYSTEMS WITH APPLICATIONS, 2023, 223
  • [3] Communicating the Privacy-Utility Trade-off: Supporting Informed Data Donation with Privacy Decision Interfaces for Differential Privacy
    Franzen, Daniel
    Müller-Birn, Claudia
    Wegwarth, Odette
    Proceedings of the ACM on Human-Computer Interaction, 2024, 8 (CSCW1)
  • [4] Differential Privacy Enabled Dementia Classification: An Exploration of the Privacy-Accuracy Trade-off in Speech Signal Data
    Suhas, B. N.
    Rajtmajer, Sarah
    Abdullah, Saeed
    INTERSPEECH 2023, 2023, : 346 - 350
  • [5] Balancing the trade-off between privacy and profitability in Social Media using NMSANT
    Ranjan, Rahul
    Charul
    Vyas, Devina
    Guntoju, Durga Prasad
    SOUVENIR OF THE 2014 IEEE INTERNATIONAL ADVANCE COMPUTING CONFERENCE (IACC), 2014, : 477 - 483
  • [6] Privacy and Personalization: The Trade-off between Data Disclosure and Personalization Benefit
    Wadle, Lisa-Marie
    Martin, Noemi
    Ziegler, Daniel
    ADJUNCT PUBLICATION OF THE 27TH CONFERENCE ON USER MODELING, ADAPTATION AND PERSONALIZATION (ACM UMAP '19 ADJUNCT), 2019, : 319 - 324
  • [7] A Survey on Privacy Preserving Synthetic Data Generation and a Discussion on a Privacy-Utility Trade-off Problem
    Ghatak, Debolina
    Sakurai, Kouichi
    SCIENCE OF CYBER SECURITY, SCISEC 2022 WORKSHOPS, 2022, 1680 : 167 - 180
  • [8] Privacy/performance trade-off in private search on bio-medical data
    Perl, H.
    Mohammed, Y.
    Brenner, M.
    Smith, M.
    FUTURE GENERATION COMPUTER SYSTEMS-THE INTERNATIONAL JOURNAL OF ESCIENCE, 2014, 36 : 441 - 452
  • [9] On the Privacy-Utility Trade-Off With and Without Direct Access to the Private Data
    Zamani, Amirreza
    Oechtering, Tobias J.
    Skoglund, Mikael
    IEEE TRANSACTIONS ON INFORMATION THEORY, 2024, 70 (03) : 2177 - 2200
  • [10] Determining privacy utility trade-off for Online Social Network data publishing
    Srivastava, Agrima
    Geethakumari, G.
    2015 ANNUAL IEEE INDIA CONFERENCE (INDICON), 2015