Privacy for Free: Spy Attack in Vertical Federated Learning by Both Active and Passive Parties

Cited: 0
Authors
Fu, Chaohao [1]
Chen, Hongbin [1]
Ruan, Na [1]
Affiliations
[1] Shanghai Jiao Tong Univ, Dept Comp Sci & Engn, Shanghai 200240, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Training; Privacy; Data models; Federated learning; Collaboration; Data privacy; Companies; Computational modeling; Distributed databases; Predictive models; Vertical federated learning; free-rider attack; data reconstruction attack; privacy preservation;
DOI
10.1109/TIFS.2025.3534469
CLC Number
TP301 [Theory, Methods]
Subject Classification Code
081202
Abstract
Vertical federated learning (VFL) is an emerging paradigm well suited to commercial collaborations among companies that share a common user base but possess distinct features. VFL enables the training of a shared global model on features from different parties while keeping each party's raw data confidential. Despite its potential, the VFL mechanism still lacks certified integrity, posing a notable threat of commercial deception or privacy infringement. In this study, we introduce a novel attack in which the attacker participates in VFL by free-riding on the collaborative process while surreptitiously extracting users' private data. This attack, reminiscent of corporate espionage tactics, is called the "spy attack". Specifically, spy attacks allow a dishonest party without sufficient data to hitch a ride by inferring the missing user features from the information shared by other participants. We design two types of spy attacks, tailored to scenarios where the attacker takes either an active or a passive role. Evaluations on four real-world datasets demonstrate the effectiveness of our attacks: they not only fulfill the stipulated collaboration through hitchhiking but also successfully steal users' private data. Even when the missing rate reaches 90%, the spy attack still yields test accuracy surpassing that of the model trained on the non-missing data and achieves reconstruction results approaching the theoretically highest quality. Furthermore, we discuss and evaluate seven possible defense strategies. The findings underscore the need for more effective and efficient defenses against spy attacks.
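The free-riding mechanism the abstract describes, in which a dishonest party fills in its missing feature values with learnable placeholders so that the gradient feedback from the active party trains the imputation "for free", can be illustrated with a short toy. The PyTorch sketch below is an assumption-laden illustration of that general idea, not the authors' implementation: all dimensions, the models, the single-process simulation, and the imputation vector `imputed` are invented for the example.

```python
# Illustrative sketch only (not the paper's attack): one two-party VFL
# training step where a dishonest passive party, missing features for
# 90% of users, "hitchhikes" by training a learnable imputation vector
# alongside its bottom model, driven by gradients from the active party.
import torch
import torch.nn as nn
import torch.nn.functional as F

d_active, d_passive, d_emb, n_cls, batch = 8, 8, 16, 2, 32

bottom_active = nn.Linear(d_active, d_emb)      # active party's bottom model
bottom_passive = nn.Linear(d_passive, d_emb)    # attacker's bottom model
top = nn.Linear(2 * d_emb, n_cls)               # active party's top model (holds labels)
imputed = nn.Parameter(torch.zeros(d_passive))  # attacker's stand-in for missing features

params = (list(bottom_active.parameters()) + list(bottom_passive.parameters())
          + list(top.parameters()) + [imputed])
opt = torch.optim.SGD(params, lr=0.1)

# Toy batch: `missing` marks users whose passive-side features are absent.
x_a = torch.randn(batch, d_active)
x_p = torch.randn(batch, d_passive)
y = torch.randint(0, n_cls, (batch,))
missing = torch.rand(batch) < 0.9               # e.g., a 90% missing rate

# The attacker substitutes its learnable imputation for the missing rows.
x_p_filled = torch.where(missing.unsqueeze(1),
                         imputed.expand(batch, d_passive), x_p)

# Standard VFL round: parties upload embeddings; the active party computes
# the loss on its labels and sends back gradients w.r.t. each embedding.
emb = torch.cat([bottom_active(x_a), bottom_passive(x_p_filled)], dim=1)
loss = F.cross_entropy(top(emb), y)

opt.zero_grad()
loss.backward()   # the gradient flowing back to the passive embedding also updates `imputed`
opt.step()
```

In a real VFL deployment each party would optimize only its own parameters and exchange only embeddings and embedding gradients; the single shared optimizer here merely keeps the toy compact.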
Pages: 2550-2563
Page count: 14