The Value of Collaboration in Convex Machine Learning with Differential Privacy

Times cited: 78
Authors
Wu, Nan [1]
Farokhi, Farhad [2, 3]
Smith, David [2, 4]
Kaafar, Mohamed Ali [1, 2]
Affiliations
[1] Macquarie Univ, N Ryde, NSW, Australia
[2] CSIRO, Data61, Canberra, ACT, Australia
[3] Univ Melbourne, Melbourne, Vic 3010, Australia
[4] Australian Natl Univ, Canberra, ACT, Australia
Source
2020 IEEE SYMPOSIUM ON SECURITY AND PRIVACY (SP 2020) | 2020
Keywords
Machine learning; Differential privacy; Stochastic gradient algorithm; Regression
DOI
10.1109/SP40000.2020.00025
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
In this paper, we apply machine learning to distributed private data owned by multiple data owners, entities with access to non-overlapping training datasets. We use noisy, differentially private gradients to minimize the fitness cost of the machine learning model using stochastic gradient descent. We quantify the quality of the trained model, using the fitness cost, as a function of the privacy budget and the size of the distributed datasets to capture the trade-off between privacy and utility in machine learning. This way, we can predict the outcome of collaboration among privacy-aware data owners prior to executing potentially computationally expensive machine learning algorithms. In particular, we show that the difference between the fitness of the model trained using differentially private gradient queries and the fitness of the model trained in the absence of any privacy concerns is inversely proportional to the square of the size of the training datasets and the square of the privacy budget. We validate the performance prediction against the actual performance of the proposed privacy-aware learning algorithms, applied to financial datasets for determining interest rates of loans using regression and for detecting credit card fraud using support vector machines.
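The scaling stated in the abstract can be illustrated with a minimal sketch. It assumes a single aggregate dataset size and a single privacy budget, and it uses a hypothetical proportionality constant `c`; the function name `predicted_gap` and all parameter names are illustrative and do not come from the paper, whose exact bound may involve per-owner terms and constants not given in this record.

```python
def predicted_gap(n_records: int, epsilon: float, c: float = 1.0) -> float:
    """Illustrative fitness gap between private and non-private training.

    Assumes gap = c / (n_records^2 * epsilon^2), where c is a hypothetical
    constant standing in for problem-dependent factors.
    """
    if n_records <= 0 or epsilon <= 0:
        raise ValueError("n_records and epsilon must be positive")
    return c / (n_records ** 2 * epsilon ** 2)


if __name__ == "__main__":
    base = predicted_gap(10_000, 1.0)
    # Ratio of the gap after doubling the data to the baseline gap.
    print(predicted_gap(20_000, 1.0) / base)
    # Ratio of the gap after halving the privacy budget to the baseline gap.
    print(predicted_gap(10_000, 0.5) / base)
```

Under this assumed form, doubling the dataset size reduces the predicted gap by a factor of four, and halving the privacy budget increases it by the same factor, which is the kind of estimate data owners could make before committing to a collaboration.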
Pages: 304-317
Page count: 14