Differentially Private Network Data Release via Structural Inference

被引:100
作者
Xiao, Qian [1 ]
Chen, Rui [2 ]
Tan, Kian-Lee [1 ,3 ]
机构
[1] Natl Univ Singapore, NGS, Singapore, Singapore
[2] Hong Kong Baptist Univ, Dept Comp Sci, Hong Kong, Peoples R China
[3] Natl Univ Singapore, Sch Comp, Singapore, Singapore
来源
PROCEEDINGS OF THE 20TH ACM SIGKDD INTERNATIONAL CONFERENCE ON KNOWLEDGE DISCOVERY AND DATA MINING (KDD'14) | 2014年
关键词
Network data; differential privacy; structural inference;
D O I
10.1145/2623330.2623642
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
Information networks, such as social media and email networks, often contain sensitive information. Releasing such network data could seriously jeopardize individual privacy. Therefore, we need to sanitize network data before the release. In this paper, we present a novel data sanitization solution that infers a network's structure in a differentially private manner. We observe that, by estimating the connection probabilities between vertices instead of considering the observed edges directly, the noise scale enforced by differential privacy can be greatly reduced. Our proposed method infers the network structure by using a statistical hierarchical random graph (HRG) model. The guarantee of differential privacy is achieved by sampling possible HRG structures in the model space via Markov chain Monte Carlo (MCMC). We theoretically prove that the sensitivity of such inference is only O(log n), where n is the number of vertices in a network. This bound implies less noise to be injected than those of existing works. We experimentally evaluate our approach on four real-life network datasets and show that our solution effectively preserves essential network structural properties like degree distribution, shortest path length distribution and influential nodes.
引用
收藏
页码:911 / 920
页数:10
相关论文
共 28 条
  • [1] Blocki J., 2013, ITCS
  • [2] Brooks S, 2011, CH CRC HANDB MOD STA, pXIX
  • [3] Chen R., VLDBJ
  • [4] Chen S., 2013, SIGMOD
  • [5] Cheng J., 2010, SIGMOD
  • [6] Clauset A., 2007, ICML STAT NETWORK AN
  • [7] Hierarchical structure and the prediction of missing links in networks
    Clauset, Aaron
    Moore, Cristopher
    Newman, M. E. J.
    [J]. NATURE, 2008, 453 (7191) : 98 - 101
  • [8] Anonymizing Bipartite Graph Data using Safe Groupings
    Cormode, Graham
    Srivastava, Divesh
    Yu, Ting
    Zhang, Qing
    [J]. PROCEEDINGS OF THE VLDB ENDOWMENT, 2008, 1 (01): : 833 - 844
  • [9] Calibrating noise to sensitivity in private data analysis
    Dwork, Cynthia
    McSherry, Frank
    Nissim, Kobbi
    Smith, Adam
    [J]. THEORY OF CRYPTOGRAPHY, PROCEEDINGS, 2006, 3876 : 265 - 284
  • [10] Gupta A, 2012, LECT NOTES COMPUT SC, V7194, P339, DOI 10.1007/978-3-642-28914-9_19