When AI Meets Information Privacy: The Adversarial Role of AI in Data Sharing Scenario

Cited by: 11
Authors
Majeed, Abdul [1]
Hwang, Seong Oun [1]
Affiliations
[1] Gachon Univ, Dept Comp Engn, Seongnam 13120, South Korea
Keywords
AI-powered attacks; artificial intelligence; background knowledge; compromising privacy; data publishing; personal data; privacy; safeguarding privacy; synthetic data; utility
DOI
10.1109/ACCESS.2023.3297646
Chinese Library Classification (CLC)
TP [Automation Technology, Computer Technology]
Discipline Code
0812
Abstract
Artificial intelligence (AI) is a transformative technology with many practical applications in commercial sectors such as healthcare, finance, aviation, and smart cities. AI also has strong synergy with the information privacy (IP) domain in two distinct roles: as a protection tool (i.e., safeguarding privacy) and as a threat tool (i.e., compromising privacy). In the former role, AI techniques are combined with traditional anonymization techniques to improve key components of the anonymization process, thereby safeguarding privacy more effectively. In the latter role, adversarial knowledge is aggregated with the help of AI techniques and subsequently used to compromise the privacy of individuals. To the best of our knowledge, the threats that AI-generated knowledge such as synthetic data (SD) poses to information privacy are often underestimated, and most existing anonymization methods do not consider or model this SD-based knowledge that can be available to the adversary, leading to privacy breaches in some cases. In this paper, we highlight the role of AI as a threat tool (i.e., AI used to compromise an individual's privacy), with a special focus on SD that can serve as background knowledge leading to various kinds of privacy breaches. For instance, SD can encompass pertinent information about the real data (e.g., the total number of attributes, the distributions of sensitive information, the category values of each attribute, and the minor and major values of some attributes) that gives the adversary a helpful hint about the composition of the anonymized data, which can subsequently lead to uncovering identities or private information. We perform experiments on a real-life benchmark dataset to demonstrate the pitfalls of AI in the data publishing scenario (i.e., when a database is fully or partially released to public domains for conducting analytics).
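For concreteness, the following minimal Python sketch (not from the paper; the table, column names, and threshold below are hypothetical) illustrates the kind of distributional hints an adversary could read directly off a synthetic release without ever seeing the real data.

# A minimal sketch of how synthetic data (SD) can leak distributional "hints"
# about the real data it was generated from. The table is hypothetical; in the
# paper's setting the SD would come from a generative model trained on the
# real data that is later anonymized and published.
import pandas as pd

# Hypothetical synthetic release mimicking an Adult-style benchmark table.
synthetic = pd.DataFrame({
    "age":       [34, 29, 51, 46, 23, 61],
    "education": ["HS", "BSc", "BSc", "MSc", "HS", "PhD"],
    "disease":   ["flu", "flu", "cancer", "flu", "HIV", "flu"],
})

# Hints an adversary can extract from the SD alone:
n_attributes = synthetic.shape[1]  # total number of attributes in the data
categories = {c: sorted(synthetic[c].unique().tolist())
              for c in synthetic.select_dtypes("object")}  # category values per attribute
sensitive_distribution = synthetic["disease"].value_counts(normalize=True)
# "Minor" sensitive values: rare categories (hypothetical 20% frequency cutoff).
minor_values = sensitive_distribution[sensitive_distribution < 0.2].index.tolist()

print("attributes:", n_attributes)
print("category values:", categories)
print("sensitive-value distribution:\n", sensitive_distribution)
print("minor (rare) sensitive values:", minor_values)

Such statistics, when compared against a released anonymized table, can narrow down how the anonymized groups were composed, which is the kind of SD-based background knowledge the paper argues existing anonymization methods fail to model.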
Pages: 76177-76195
Page count: 19