Privacy adversarial network: Representation learning for mobile data privacy

Cited by: 36
Authors
Liu S. [1 ]
Du J. [1 ]
Shrivastava A. [2 ]
Zhong L. [2 ,3 ]
Affiliations
[1] Xidian University, School of Computer Science and Technology, Xi'an
[2] Rice University, Department of Computer Science, Houston, TX
[3] Rice University, Department of Electrical and Computer Engineering, Houston, TX
Keywords
Budget control; Signal encoding; Learning systems; Risk assessment; Data mining
DOI
10.1145/3369816
Abstract
The remarkable success of machine learning has fostered a growing number of cloud-based intelligent services for mobile users. Such a service requires a user to send data, e.g., images, voice, and video, to the provider, which poses a serious challenge to user privacy. To address this, prior work either obfuscates the data, e.g., by adding noise or removing identity information, or sends representations extracted from the data, e.g., anonymized features. Both approaches struggle to balance service utility and data privacy: obfuscated data reduces utility, and extracted representations may still reveal sensitive information. This work departs from prior work in methodology: we leverage adversarial learning to better balance privacy and utility. We design a representation encoder that generates feature representations optimized against the privacy disclosure risk of sensitive information (a measure of privacy), as assessed by privacy adversaries, and concurrently optimized for task inference accuracy (a measure of utility), as assessed by a utility discriminator. The result is the privacy adversarial network (PAN), a novel deep model with a new training algorithm that automatically learns representations from raw data. The trained encoder can be deployed on the user side to generate representations that satisfy the task-defined utility requirements and the user-specified/agnostic privacy budgets. Intuitively, PAN adversarially forces the extracted representations to convey only the information required by the target task. Surprisingly, this constitutes an implicit regularization that actually improves task accuracy, so PAN achieves better utility and better privacy at the same time. We report extensive experiments on six popular datasets and demonstrate the superiority of PAN over alternative methods reported in prior work. Copyright © 2019 held by the owner/author(s).
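The adversarial objective described in the abstract, where the encoder is trained for task accuracy while working against a privacy adversary, can be sketched in a few lines. This is a minimal illustrative sketch, not the authors' implementation: the linear encoder/discriminators, the weight `lam`, and all variable names are assumptions chosen for brevity.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def cross_entropy(logits, labels):
    # mean negative log-likelihood of the true labels
    p = softmax(logits)
    return -np.mean(np.log(p[np.arange(len(labels)), labels] + 1e-12))

rng = np.random.default_rng(0)
x = rng.normal(size=(8, 16))        # batch of raw user data
W_enc = rng.normal(size=(16, 4))    # encoder: raw data -> representation
W_util = rng.normal(size=(4, 3))    # utility discriminator (3 task classes)
W_priv = rng.normal(size=(4, 2))    # privacy adversary (2 sensitive classes)
y_task = rng.integers(0, 3, size=8) # task labels
y_sens = rng.integers(0, 2, size=8) # sensitive attribute labels

z = x @ W_enc                       # representation sent to the cloud

util_loss = cross_entropy(z @ W_util, y_task)  # utility: minimize
priv_loss = cross_entropy(z @ W_priv, y_sens)  # privacy: adversary minimizes this

lam = 0.5                           # privacy/utility trade-off weight (assumed)
# Encoder objective: help the task, hurt the adversary.
encoder_loss = util_loss - lam * priv_loss
```

In actual training the adversary and the encoder would be updated in alternating gradient steps, with deep networks in place of the linear maps above; this sketch only shows how the two loss terms combine into the encoder's objective.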
References (43 total)
[1] Abadi M., Chu A., Goodfellow I., McMahan H.B., Mironov I., Talwar K., Zhang L., Deep learning with differential privacy, Proceedings of SIGSAC, pp. 308-318, (2016)
[2] Bhatia J., Breaux T.D., Friedberg L., Hibshi H., Smullen D., Privacy risk in cybersecurity data sharing, Proceedings of ACM Workshop on ISCS, pp. 57-64, (2016)
[3] Chen J., Konrad J., Ishwar P., VGAN-based image representation learning for privacy-preserving facial expression recognition, Proceedings of CVPR Workshops, pp. 1570-1579, (2018)
[4] Chien J.-T., Chen C.-H., Deep discriminative manifold learning, Proceedings of ICASSP, pp. 2672-2676, (2016)
[5] Collette A., HDF5 for Python, (2018)
[6] Deng J., Dong W., Socher R., Li L.-J., Li K., Fei-Fei L., ImageNet: A large-scale hierarchical image database, Proceedings of CVPR, (2009)
[7] Dwork C., Naor M., Pitassi T., Rothblum G.N., Differential privacy under continual observation, Proceedings of STOC, pp. 715-724, (2010)
[8] Dwork C., Roth A., et al., The algorithmic foundations of differential privacy, Foundations and Trends in Theoretical Computer Science, pp. 211-407, (2014)
[9] Dwork C., Smith A., Steinke T., Ullman J., Exposed! A survey of attacks on private data, Annual Review of Statistics and Its Application, 4, pp. 61-84, (2017)
[10] Edwards H., Storkey A., Censoring Representations with an Adversary, (2015)