Privacy for Free: Spy Attack in Vertical Federated Learning by Both Active and Passive Parties

Cited: 0
Authors
Fu, Chaohao [1]
Chen, Hongbin [1]
Ruan, Na [1]
Affiliations
[1] Shanghai Jiao Tong Univ, Dept Comp Sci & Engn, Shanghai 200240, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Training; Privacy; Data models; Federated learning; Collaboration; Data privacy; Companies; Computational modeling; Distributed databases; Predictive models; Vertical federated learning; free-rider attack; data reconstruction attack; privacy preservation
DOI
10.1109/TIFS.2025.3534469
CLC number
TP301 [Theory, Methods]
Discipline code
081202
Abstract
Vertical federated learning (VFL) is an emerging paradigm well suited to commercial collaborations among companies that share a common user base but possess distinct features. VFL enables the training of a shared global model on features from different parties while keeping each party's raw data confidential. Despite its potential, the VFL mechanism still lacks certified integrity, posing a notable threat of commercial deception or privacy infringement. In this study, we introduce a novel attack in which the attacker participates in VFL by free-riding on the collaborative process while surreptitiously extracting users' private data. This attack, reminiscent of corporate espionage tactics, is called the "spy attack". Specifically, spy attacks allow a dishonest party without sufficient data to hitch a ride by inferring the missing user features from the information shared by other participants. We design two types of spy attack, tailored to the scenarios in which the attacker takes an active or a passive role. Evaluations on four real-world datasets demonstrate the effectiveness of our attacks, which not only fulfill the stipulated collaboration through hitchhiking but also successfully steal users' private data. Even when the missing rate reaches 90%, the spy attack yields a test accuracy that surpasses that of the model trained on the non-missing data, and achieves reconstruction results approaching the theoretically best quality. Furthermore, we discuss and evaluate seven possible defense strategies. The findings underscore the need for more effective and efficient defenses against spy attacks.
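The mechanism described in the abstract can be illustrated with a toy simulation. The following sketch is a hypothetical simplification, not the paper's actual algorithm: it assumes the active-attacker variant, in which the spy holds the labels and the top model and therefore observes the embeddings uploaded by the honest passive party. On the small fraction of users whose features it genuinely has, the spy fits an imputation network mapping the partner's embedding to its own features, then reuses it to fill in the missing 90%. All names (bottom_a, impute, etc.) are illustrative.

    import torch
    import torch.nn as nn

    torch.manual_seed(0)
    n, d_a, d_b, d_emb = 512, 8, 8, 4

    # Synthetic vertically partitioned data: the spy (active party) holds
    # labels y and features x_a, but x_a is missing for 90% of users; the
    # honest passive party holds x_b for everyone.
    x_a = torch.randn(n, d_a)
    x_b = torch.randn(n, d_b)
    y = ((x_a.sum(1) + x_b.sum(1)) > 0).long()
    known = torch.rand(n) < 0.10  # rows where the spy really has x_a

    bottom_a = nn.Linear(d_a, d_emb)   # spy's bottom model
    bottom_b = nn.Linear(d_b, d_emb)   # honest party's bottom model
    top = nn.Linear(2 * d_emb, 2)      # top model, held by the spy
    impute = nn.Linear(d_emb, d_a)     # spy's imputer: partner embedding -> x_a

    opt = torch.optim.Adam(
        list(bottom_a.parameters()) + list(bottom_b.parameters())
        + list(top.parameters()), lr=1e-2)
    opt_spy = torch.optim.Adam(impute.parameters(), lr=1e-2)

    for step in range(300):
        # Honest VFL round: the passive party uploads its embeddings e_b.
        # In a real deployment the spy would return gradients w.r.t. e_b;
        # the joint optimizer emulates that exchange here.
        e_b = bottom_b(x_b)
        # Free-riding: wherever x_a is missing, substitute the imputer's guess.
        x_a_hat = torch.where(known[:, None], x_a, impute(e_b).detach())
        e_a = bottom_a(x_a_hat)
        loss = nn.functional.cross_entropy(top(torch.cat([e_a, e_b], 1)), y)
        opt.zero_grad()
        loss.backward()
        opt.step()

        # Spying: on the rows the spy does know, fit the imputer to recover
        # x_a from the received embedding, then reuse it on missing rows.
        rec = impute(bottom_b(x_b[known]).detach())
        spy_loss = nn.functional.mse_loss(rec, x_a[known])
        opt_spy.zero_grad()
        spy_loss.backward()
        opt_spy.step()

    print(f"task loss {loss.item():.3f}  reconstruction MSE {spy_loss.item():.3f}")

In the paper's passive-attacker variant the spy does not see partner embeddings directly and must work from the signals a passive party legitimately receives; the sketch above only illustrates the general free-riding idea under the stated assumptions.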
Pages: 2550-2563
Page count: 14