Defending Against Data Poisoning Attacks: From Distributed Learning to Federated Learning

Cited by: 4
Authors
Tian, Yuchen [1]
Zhang, Weizhe [1]
Simpson, Andrew [2]
Liu, Yang [1]
Jiang, Zoe Lin [1]
Affiliations
[1] Harbin Inst Technol Shenzhen, Coll Comp Sci & Technol, Shenzhen 518055, Peoples R China
[2] Univ Oxford, Dept Comp Sci, Oxford OX1 3QD, England
Keywords
distributed learning; federated learning; data poisoning attacks; AI security
DOI
10.1093/comjnl/bxab192
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Federated learning (FL), a variant of distributed learning (DL), supports the training of a shared model without accessing private data from different sources. Despite its benefits with regard to privacy preservation, FL's distributed nature and privacy constraints make it vulnerable to data poisoning attacks. Existing defenses, primarily designed for DL, are typically not well adapted to FL. In this paper, we study such attacks and defenses. We start from the perspective of DL and then consider a real-world FL scenario, with the aim of exploring the requisites of a desirable defense in FL. Our study shows that (i) the batch size used in each training round affects the effectiveness of defenses in DL, (ii) the defenses investigated are somewhat effective and moderately influenced by batch size in FL settings and (iii) non-IID data makes it more difficult to defend against data poisoning attacks in FL. Based on these findings, we discuss the key challenges and possible directions in defending against such attacks in FL. In addition, we propose Detect and Suppress the Potential Outliers (DSPO), a defense against data poisoning attacks in FL scenarios. Our results show that DSPO outperforms other defenses in several cases.
Pages: 711-726
Number of pages: 16