Utility-Privacy Tradeoffs in Databases: An Information-Theoretic Approach

Cited by: 247
Authors
Sankar, Lalitha [1]
Rajagopalan, S. Raj [2]
Poor, H. Vincent [3]
Affiliations
[1] Princeton Univ, Princeton, NJ 08544 USA
[2] HP Labs, Princeton, NJ 08540 USA
[3] Princeton Univ, Dept Elect Engn, Princeton, NJ 08544 USA
Funding
National Science Foundation (USA)
Keywords
Utility; privacy; databases; rate-distortion theory; rate-distortion function; equivocation; side information
DOI
10.1109/TIFS.2013.2253320
Chinese Library Classification number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
Ensuring the usefulness of electronic data sources while providing necessary privacy guarantees is an important unsolved problem. This problem drives the need for an analytical framework that can quantify the privacy of personally identifiable information while still providing a quantifiable benefit (utility) to multiple legitimate information consumers. This paper presents an information-theoretic framework that promises an analytical model guaranteeing tight bounds on how much utility is possible for a given level of privacy and vice versa. Specific contributions include: 1) stochastic data models for both categorical and numerical data; 2) utility-privacy tradeoff regions, the encoding (sanitization) schemes achieving them for both classes of data, and their practical relevance; and 3) modeling of prior knowledge at the user and/or data source, with optimal encoding schemes for both cases.
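As a rough illustration of the kind of tradeoff the abstract describes, the sketch below measures utility as expected Hamming distortion and privacy as equivocation (conditional entropy) for a toy binary record. Everything in it is an assumption made for illustration and is not taken from the paper: the function name `tradeoff`, the uniform private bit S, the public bit X that differs from S with probability `corr_noise`, and a sanitizer that releases Y by flipping X with probability `flip_prob`.

```python
from itertools import product
from math import log2


def tradeoff(flip_prob, corr_noise=0.1):
    """Return (distortion, equivocation) for a toy binary sanitizer.

    Hypothetical model, not the paper's scheme:
      S: private bit, uniform on {0, 1}
      X: public bit, differs from S with probability corr_noise
      Y: released bit, differs from X with probability flip_prob
    distortion   = E[X != Y]        (lower means more utility)
    equivocation = H(S | Y) in bits (higher means more privacy)
    """
    joint_sy = {}      # joint distribution of (S, Y)
    distortion = 0.0
    for s, x, y in product((0, 1), repeat=3):
        p_s = 0.5
        p_x_given_s = 1 - corr_noise if x == s else corr_noise
        p_y_given_x = 1 - flip_prob if y == x else flip_prob
        p = p_s * p_x_given_s * p_y_given_x
        joint_sy[(s, y)] = joint_sy.get((s, y), 0.0) + p
        distortion += p * (x != y)

    # Marginal of Y, then H(S|Y) = -sum p(s,y) * log2 p(s|y).
    p_y = {y: joint_sy[(0, y)] + joint_sy[(1, y)] for y in (0, 1)}
    equivocation = -sum(
        p * log2(p / p_y[y]) for (s, y), p in joint_sy.items() if p > 0
    )
    return distortion, equivocation


if __name__ == "__main__":
    for flip in (0.0, 0.1, 0.25, 0.5):
        d, e = tradeoff(flip)
        print(f"flip={flip:.2f}  distortion={d:.3f}  equivocation={e:.3f} bits")
```

In this toy model, raising `flip_prob` from 0 toward 0.5 increases both the distortion and the equivocation, so more privacy is bought with less utility. The paper's tradeoff regions characterize the best achievable pairs over all sanitization schemes, which this single-parameter sanitizer does not attempt to do.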
Pages: 838-852
Number of pages: 15