Rapid Re-Identification Risk Assessment for Anonymous Data Set in Mobile Multimedia Scene

Cited by: 5
Authors
Yang, Zhigang [1 ,2 ,3 ,4 ]
Wang, Ruyan [1 ,3 ,4 ]
Luo, Daizhong [2 ]
Xiong, Yu [2 ]
Affiliations
[1] Chongqing Univ Posts & Telecommun, Sch Commun & Informat Engn, Chongqing 400065, Peoples R China
[2] Chongqing Univ Arts & Sci, Sch Artificial Intelligence, Chongqing 402160, Peoples R China
[3] Key Lab Opt Commun & Networks, Chongqing 400065, Peoples R China
[4] Key Lab Ubiquitous Sensing & Networking, Chongqing 400065, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Data privacy; Data models; Trajectory; Risk management; Couplings; Privacy; Multimedia systems; Multimedia; overall re-identification risk; attribute dependency; de-anonymization;
DOI
10.1109/ACCESS.2020.2977404
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Ubiquitous mobile multimedia applications bring great convenience to users. However, when using mobile multimedia services, users provide personal data to service platforms. Although the service platforms always claim that the collected personal data are de-identified, the risk of re-identifying users through linkage attacks still exists and is difficult to quantify. This paper proposes a rapid prediction model for the overall re-identification risk based on the statistics of a data set (i.e., the number of individuals, the number of attributes, the distribution of attribute values, and the attribute dependency). The proposed model reveals the impact of these statistics on the overall re-identification risk. It adopts a random sampling method to predict the overall re-identification risk of data sets without strongly dependent ordered attribute pairs, and a semi-random sampling method for data sets with such pairs. Experimental results show that for data sets without strongly dependent ordered attribute pairs, the random sampling method has high prediction accuracy (the prediction error is less than 0.05). For data sets with strongly dependent ordered attribute pairs, the semi-random sampling method has high prediction accuracy (the prediction error is less than 0.09). Using our model, governments and individuals can quickly assess the privacy leakage risk of their data sets, given only the statistics of the data sets. In addition, the model can evaluate the privacy risk of data collection schemes in advance according to historical statistics, and identify suspicious services.
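The idea of predicting risk from statistics alone can be illustrated with a minimal Python sketch. It is not the paper's model: the abstract does not define the risk measure or the sampling procedures, so the sketch assumes that overall re-identification risk can be approximated by the fraction of records that are unique on all attributes, and that attributes are independent (the case without strongly dependent attribute pairs). The function name `estimate_uniqueness_risk` and all of its parameters are hypothetical.

```python
import random
from collections import Counter


def estimate_uniqueness_risk(n_individuals, attribute_distributions,
                             trials=20, seed=0):
    """Estimate the fraction of records that are unique on all attributes.

    Assumption (not from the paper): attributes are independent, and each
    entry of attribute_distributions maps an attribute value to its
    probability. Synthetic data sets are drawn by random sampling and the
    share of unique records is averaged over all trials.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        # Draw one column of values per attribute from its marginal distribution.
        columns = []
        for dist in attribute_distributions:
            values = list(dist.keys())
            weights = list(dist.values())
            columns.append(rng.choices(values, weights=weights, k=n_individuals))
        # A record is the tuple of one individual's attribute values.
        records = list(zip(*columns))
        counts = Counter(records)
        # A record that no other individual shares is treated as re-identifiable.
        unique = sum(1 for record in records if counts[record] == 1)
        total += unique / n_individuals
    return total / trials


if __name__ == "__main__":
    # Hypothetical statistics: 1000 individuals, three categorical attributes.
    gender = {"F": 0.5, "M": 0.5}
    age_band = {band: 0.1 for band in range(10)}
    region = {code: 0.02 for code in range(50)}
    risk = estimate_uniqueness_risk(1000, [gender, age_band, region])
    print(f"estimated share of unique records: {risk:.3f}")
```

Under these assumptions, the estimate needs only the number of individuals and the per-attribute value distributions, which mirrors the kind of input the abstract lists. Handling dependent attribute pairs would require sampling from conditional distributions instead of independent marginals.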
Pages: 41557-41565
Number of pages: 9
Related papers
50 in total
  • [1] Enabling realistic health data re-identification risk assessment through adversarial modeling
    Xia, Weiyi
    Liu, Yongtai
    Wan, Zhiyu
    Vorobeychik, Yevgeniy
    Kantarcioglu, Murat
    Nyemba, Steve
    Clayton, Ellen Wright
    Malin, Bradley A.
    JOURNAL OF THE AMERICAN MEDICAL INFORMATICS ASSOCIATION, 2021, 28 (04) : 744 - 752
  • [2] Estimating the re-identification risk of clinical data sets
    Dankar, Fida Kamal
    El Emam, Khaled
    Neisa, Angelica
    Roffey, Tyson
    BMC MEDICAL INFORMATICS AND DECISION MAKING, 2012, 12
  • [3] Risk of re-identification of epigenetic methylation data: a more nuanced response is needed
    Yann Joly
    Stephanie OM Dyke
    Warren A Cheung
    Mark A Rothstein
    Tomi Pastinen
    Clinical Epigenetics, 2015, 7
  • [4] A Re-identification Risk-based Anonymization Framework for Data Analytics Platforms
    Silva, Hebert
    Basso, Tania
    Moraes, Regina
    Elia, Donatello
    Fiore, Sandro
    2018 14TH EUROPEAN DEPENDABLE COMPUTING CONFERENCE (EDCC 2018), 2018, : 101 - 106
  • [5] Risk of re-identification of epigenetic methylation data: a more nuanced response is needed
    Joly, Yann
    Dyke, Stephanie O. M.
    Cheung, Warren A.
    Rothstein, Mark A.
    Pastinen, Tomi
    CLINICAL EPIGENETICS, 2015, 7
  • [6] Re-Identification Risk Based Security Controls
    Di Cerbo, Francesco
    Trabelsi, Slim
    ON THE MOVE TO MEANINGFUL INTERNET SYSTEMS: OTM 2014 WORKSHOPS, 2014, 8842 : 99 - 107
  • [7] Effect of Data Degradation on Motion Re-Identification
    Nair, Vivek
    Miller, Mark Roman
    Wang, Rui
    Huang, Brandon
    Rack, Christian
    Latoschik, Marc Erich
    O'Brien, James F.
    PROCEEDINGS 2024 IEEE 25TH INTERNATIONAL SYMPOSIUM ON A WORLD OF WIRELESS, MOBILE AND MULTIMEDIA NETWORKS, WOWMOM 2024, 2024, : 85 - 90
  • [8] Estimating Re-identification Risk by Means of Formal Conceptualization
    Aranda-Corral, Gonzalo A.
    Borrego-Diaz, Joaquin
    Galan-Paez, Juan
    14TH INTERNATIONAL CONFERENCE ON COMPUTATIONAL INTELLIGENCE IN SECURITY FOR INFORMATION SYSTEMS AND 12TH INTERNATIONAL CONFERENCE ON EUROPEAN TRANSNATIONAL EDUCATIONAL (CISIS 2021 AND ICEUTE 2021), 2022, 1400 : 13 - 22
  • [9] A Game Theoretic Framework for Analyzing Re-Identification Risk
    Wan, Zhiyu
    Vorobeychik, Yevgeniy
    Xia, Weiyi
    Clayton, Ellen Wright
    Kantarcioglu, Murat
    Ganta, Ranjit
    Heatherly, Raymond
    Malin, Bradley A.
    PLOS ONE, 2015, 10 (03):
  • [10] Resisting re-identification mining on social graph data
    Jianliang Gao
    Qing Ping
    Jianxin Wang
    World Wide Web, 2018, 21 : 1759 - 1771