An Information-Theoretic Approach to Individual Sequential Data Sanitization

Cited by: 9
Authors
Bonomi, Luca [1]
Fan, Liyue [2]
Jin, Hongxia [3]
Affiliations
[1] Univ Calif San Diego, La Jolla, CA 92093 USA
[2] Univ Southern Calif, Los Angeles, CA USA
[3] Samsung Res Amer, San Jose, CA USA
Source
PROCEEDINGS OF THE NINTH ACM INTERNATIONAL CONFERENCE ON WEB SEARCH AND DATA MINING (WSDM'16) | 2016
Keywords
Data Sanitization; Sequential Patterns; Mutual Information
DOI
10.1145/2835776.2835828
Chinese Library Classification number
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Fine-grained personal data, such as location check-ins, web histories, and physical activity records, is now generated continuously and in large volumes. These data sequences are typically shared with untrusted parties for data analysis and promotional services. However, individually generated sequential data contains behavior patterns and may disclose sensitive information if not properly sanitized. At the same time, sanitization techniques can reduce the utility of the released sequence. In this paper, we study the problem of sanitizing an individual's sequence data with minimum utility loss, given user-specified sensitive patterns. We propose a privacy notion based on information theory and sanitize sequence data via generalization. We show that the optimization problem is hard and develop two efficient heuristic solutions. Extensive experimental evaluations on real-world datasets demonstrate the efficiency and effectiveness of our solutions.
Pages: 337-346
Number of pages: 10
Related papers
30 items in total
  • [1] Hiding Sequential and Spatiotemporal Patterns
    Abul, Osman
    Bonchi, Francesco
    Giannotti, Fosca
    [J]. IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2010, 22 (12) : 1709 - 1723
  • [2] [Anonymous], 2012, Proceedings of the 2012 ACM SIGMOD International Conference on Management of Data
  • [3] A Two-Phase Algorithm for Mining Sequential Patterns with Differential Privacy
    Bonomi, Luca
    Xiong, Li
    [J]. PROCEEDINGS OF THE 22ND ACM INTERNATIONAL CONFERENCE ON INFORMATION & KNOWLEDGE MANAGEMENT (CIKM'13), 2013, : 269 - 278
  • [4] Calmon FD, 2012, ANN ALLERTON CONF, P1401, DOI 10.1109/Allerton.2012.6483382
  • [5] rho-uncertainty: Inference-Proof Transaction Anonymization
    Cao, Jianneng
    Karras, Panagiotis
    Raissi, Chedy
    Tan, Kian-Lee
    [J]. PROCEEDINGS OF THE VLDB ENDOWMENT, 2010, 3 (01) : 1033 - 1044
  • [6] Chen R., 2012, Proceedings of the 18th ACM SIGKDD international conference on Knowledge discovery and data mining, P213, DOI 10.1145/2339530.2339564
  • [7] Local Privacy and Statistical Minimax Rates
    Duchi, John C.
    Jordan, Michael I.
    Wainwright, Martin J.
    [J]. 2013 IEEE 54TH ANNUAL SYMPOSIUM ON FOUNDATIONS OF COMPUTER SCIENCE (FOCS), 2013, : 429 - 438
  • [8] Duhigg C, 2012, The New York Times
  • [9] Dwork C, 2006, LECT NOTES COMPUT SC, V4052, P1
  • [10] Reality mining: sensing complex social systems
    Eagle, Nathan
    Pentland, Alex
    [J]. PERSONAL AND UBIQUITOUS COMPUTING, 2006, 10 (04) : 255 - 268