Private Analysis of Graph Structure

被引:140
作者
Karwa, Vishesh [1 ]
Raskhodnikova, Sofya [2 ]
Smith, Adam [2 ]
Yaroslavtsev, Grigory [2 ]
机构
[1] Penn State Univ, Dept Stat, University Pk, PA 16802 USA
[2] Penn State Univ, Dept Comp Sci & Engn, University Pk, PA 16802 USA
来源
ACM TRANSACTIONS ON DATABASE SYSTEMS | 2014年 / 39卷 / 03期
基金
美国国家科学基金会;
关键词
Statistical data privacy; social network analysis; graph algorithms; differential privacy; MODELS; NOISE;
D O I
10.1145/2611523
中图分类号
TP [自动化技术、计算机技术];
学科分类号
0812 ;
摘要
We present efficient algorithms for releasing useful statistics about graph data while providing rigorous privacy guarantees. Our algorithms work on datasets that consist of relationships between individuals, such as social ties or email communication. The algorithms satisfy edge differential privacy, which essentially requires that the presence or absence of any particular relationship be hidden. Our algorithms output approximate answers to subgraph counting queries. Given a query graph H, for example, a triangle, k-star, or k-triangle, the goal is to return the number of edge-induced isomorphic copies of H in the input graph. The special case of triangles was considered by Nissim et al. [2007] and a more general investigation of arbitrary query graphs was initiated by Rastogi et al. [2009]. We extend the approach of Nissim et al. to a new class of statistics, namely k-star queries. We also give algorithms for k-triangle queries using a different approach based on the higher-order local sensitivity. For the specific graph statistics we consider (i.e., k-stars and k-triangles), we significantly improve on the work of Rastogi et al.: our algorithms satisfy a stronger notion of privacy that does not rely on the adversary having a particular prior distribution on the data, and add less noise to the answers before releasing them. We evaluate the accuracy of our algorithms both theoretically and empirically, using a variety of real and synthetic datasets. We give explicit, simple conditions under which these algorithms add a small amount of noise. We also provide the average-case analysis in the Erdos-Renyi-Gilbert G(n, p) random graph model. Finally, we give hardness results indicating that the approach Nissim et al. used for triangles cannot easily be extended to k-triangles (hence justifying our development of a new algorithmic approach).
引用
收藏
页数:33
相关论文
共 36 条
  • [11] Calibrating noise to sensitivity in private data analysis
    Dwork, Cynthia
    McSherry, Frank
    Nissim, Kobbi
    Smith, Adam
    [J]. THEORY OF CRYPTOGRAPHY, PROCEEDINGS, 2006, 3876 : 265 - 284
  • [12] Dwork C, 2009, ACM S THEORY COMPUT, P371
  • [13] Dwork C, 2009, LECT NOTES COMPUT SC, V5444, P496
  • [14] Gehrke J, 2011, LECT NOTES COMPUT SC, V6597, P432, DOI 10.1007/978-3-642-19571-6_26
  • [15] A Survey of Statistical Network Models
    Goldenberg, Anna
    Zheng, Alice X.
    Fienberg, Stephen E.
    Airoldi, Edoardo M.
    [J]. FOUNDATIONS AND TRENDS IN MACHINE LEARNING, 2010, 2 (02): : 129 - 233
  • [16] Gupta A, 2012, LECT NOTES COMPUT SC, V7194, P339, DOI 10.1007/978-3-642-28914-9_19
  • [17] Accurate Estimation of the Degree Distribution of Private Networks
    Hay, Michael
    Li, Chao
    Miklau, Gerome
    Jensen, David
    [J]. 2009 9TH IEEE INTERNATIONAL CONFERENCE ON DATA MINING, 2009, : 169 - 178
  • [18] Curved exponential family models for social networks
    Hunter, David R.
    [J]. SOCIAL NETWORKS, 2007, 29 (02) : 216 - 230
  • [19] Karwa Vishesh, 2012, Privacy in Statistical Databases. UNESCO Chair in Data Privacy. Proceedings of the International Conference, PSD 2012, P273, DOI 10.1007/978-3-642-33627-0_21
  • [20] Private Analysis of Graph Structure
    Karwa, Vishesh
    Raskhodnikova, Sofya
    Smith, Adam
    Yaroslavtsev, Grigory
    [J]. ACM TRANSACTIONS ON DATABASE SYSTEMS, 2014, 39 (03):