TIDY: Publishing a Time Interval Dataset With Differential Privacy

Cited by: 7
Authors
Jung, Woohwan [1]
Kwon, Suyong [1]
Shim, Kyuseok [1]
Affiliations
[1] Seoul Natl Univ, Dept Elect & Comp Engn, Seoul Gwanak POB 55, Seoul 08755, South Korea
Funding
National Research Foundation, Singapore;
Keywords
Data privacy; Publishing; Histograms; Privacy; Time-frequency analysis; Two dimensional displays; Partitioning algorithms; Privacy-preserving data publishing; differential privacy; time interval dataset; WEB;
DOI
10.1109/TKDE.2019.2952351
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Log data from mobile devices generally contain a series of events with temporal information including time intervals which consist of the start and finish times. However, the problem of releasing differentially private time interval datasets has not been tackled yet. A time interval dataset can be represented by a two dimensional (2D) histogram. Most of the methods to publish 2D histograms partition the data into rectangular spaces to reduce the aggregated noise error for range queries. However, the existing algorithms to publish 2D histograms suffer from the structural error when applied to time interval datasets. To reduce the aggregated noise errors and suppress the increase in the structural error, we propose the TIDY (publishing Time Intervals via Differential privacY) algorithm. We use the frequency vectors as a compact representation of the time interval dataset. After applying the Laplace mechanism to the frequency vectors, we improve the utility of the frequency vectors based on a maximum likelihood estimation. We also develop a new partitioning method adapted for the frequency vectors to balance the trade-off between the noise and structural errors. Our empirical study on real-life and synthetic datasets confirms that TIDY outperforms the existing algorithms for 2D histograms.
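The frequency-vector and Laplace-mechanism steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not define the frequency vectors, so the sketch assumes they are per-time-step counts of start and finish times over a discrete domain 0..T-1, and it assumes an L1 sensitivity of 2 (one interval changes one start count and one finish count). The function names are hypothetical, and TIDY's maximum likelihood post-processing and partitioning steps are omitted.

```python
import random

def frequency_vectors(intervals, T):
    """Count how many intervals start and finish at each time step 0..T-1."""
    start = [0.0] * T
    finish = [0.0] * T
    for s, f in intervals:
        start[s] += 1
        finish[f] += 1
    return start, finish

def laplace_noise(scale, rng):
    # The difference of two independent Exp(1) draws, scaled, is Laplace(0, scale).
    return scale * (rng.expovariate(1.0) - rng.expovariate(1.0))

def laplace_release(start, finish, epsilon, rng):
    # Assumption: adding or removing one interval changes one start count and
    # one finish count, so the L1 sensitivity of the pair of vectors is 2.
    scale = 2.0 / epsilon
    noisy_start = [c + laplace_noise(scale, rng) for c in start]
    noisy_finish = [c + laplace_noise(scale, rng) for c in finish]
    return noisy_start, noisy_finish

def count_overlapping(start, finish, a, b):
    """Intervals intersecting [a, b]: those started by b minus those finished before a."""
    return sum(start[:b + 1]) - sum(finish[:a])

# Toy data: closed intervals (start, finish) with start <= finish.
intervals = [(0, 2), (1, 4), (3, 3), (5, 7)]
start, finish = frequency_vectors(intervals, T=8)
exact = count_overlapping(start, finish, 2, 4)

rng = random.Random(0)
noisy_start, noisy_finish = laplace_release(start, finish, epsilon=1.0, rng=rng)
noisy = count_overlapping(noisy_start, noisy_finish, 2, 4)
```

Under these assumptions a range query sums many noisy counts, so its noise grows with the length of the prefixes involved. This is the aggregated noise error that the abstract says the partitioning method trades off against structural error.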
Pages: 2280-2294
Page count: 15