Preserving Model Privacy for Machine Learning in Distributed Systems

Cited by: 38

Authors:

Jia, Qi [1]

Guo, Linke [1]

Jin, Zhanpeng [1]

Fang, Yuguang [2]

Affiliations:

[1] SUNY Binghamton, Dept Elect & Comp Engn, Binghamton, NY 13902 USA

[2] Univ Florida, Dept Elect & Comp Engn, Gainesville, FL 32611 USA

Source:

IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS | 2018 / Vol. 29 / No. 08

Funding:

U.S. National Science Foundation;

Keywords:

Machine learning; privacy preservation; data classification; model evaluation; AUTHENTICATION SYSTEM; NETWORKS;

DOI:

10.1109/TPDS.2018.2809624

Chinese Library Classification (CLC) number:

TP301 [Theory, Methods];

Discipline classification code:

081202;

Abstract:

Machine learning-based data classification is a widely used data mining technique. By learning from massive data collected from the real world, data classification helps learners discover hidden data patterns. These patterns are represented by the learned model in different machine learning schemes. Based on such models, a user can determine whether new incoming data belongs to an existing class, or multiple entities can test the similarity of their datasets. However, due to data locality and privacy concerns, it is infeasible for large-scale distributed systems to share each individual's datasets for classification or testing. On the one hand, the learned model is an entity's private asset and may leak private information, so it should be well protected from all non-collaborative entities. On the other hand, the new incoming data may contain sensitive information that cannot be disclosed directly for classification. To address these privacy issues, we propose an approach that preserves model privacy during data classification and similarity evaluation in distributed systems. With our scheme, neither new data nor learned models are directly revealed during the classification and similarity evaluation procedures. Based on extensive real-world experiments, we have evaluated the privacy preservation, feasibility, and efficiency of the proposed scheme.

Pages: 1808-1822

Page count: 15

References: 44 in total

