A Multi-Modal Vertical Federated Learning Framework Based on Homomorphic Encryption

Cited by: 9
Authors
Gong, Maoguo [1 ]
Zhang, Yuanqiao [1 ]
Gao, Yuan [1 ]
Qin, A. K. [2 ]
Wu, Yue [1 ]
Wang, Shanfeng [1 ]
Zhang, Yihong [1 ]
Affiliations
[1] Xidian Univ, Key Lab Collaborat Intelligence Syst, Minist Educ, Xian 710071, Shaanxi, Peoples R China
[2] Swinburne Univ Technol, Dept Comp Technol, Melbourne, Vic 3122, Australia
Funding
National Natural Science Foundation of China;
Keywords
Vertical federated learning; universal framework; homomorphic encryption; bivariate Taylor series expansion; multi-modal learning; cross-domain semantic feature extraction;
DOI
10.1109/TIFS.2023.3340994
Chinese Library Classification
TP301 [Theory and Methods];
Discipline Classification Code
081202;
Abstract
Federated learning has gained prominence as an effective solution for addressing data silos, enabling collaboration among multiple parties without sharing their data. However, existing federated learning algorithms often neglect the challenge posed by multi-modal data distribution. Moreover, previous pioneering works face limitations in encrypting the exponential and logarithmic operations of an objective function with multiple independent variables, and they rely on a third-party cooperator for encryption. To address these limitations, this paper introduces a universal multi-modal vertical federated learning framework. To tackle the data distribution challenge, we propose a two-step multi-modal transformer model that captures cross-domain semantic features effectively. For encryption, where traditional additively homomorphic encryption algorithms fall short by supporting only addition and multiplication, we employ a bivariate Taylor series expansion to transform the objective function. Integrating these components, we present a comprehensive training and transmission protocol that eliminates the need for a third-party cooperator during the encryption process. Extensive experiments conducted on diverse video-text and image-text datasets validate the superior performance of our framework compared to state-of-the-art approaches, affirming its effectiveness in multi-modal vertical federated learning settings.
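To illustrate the encryption idea in the abstract: additively homomorphic schemes (e.g., Paillier) support only ciphertext addition and plaintext-scalar multiplication, so exponential and logarithmic terms must first be replaced by a low-degree polynomial. The sketch below (an illustration, not the paper's exact protocol; the loss, expansion point, and order are assumptions) applies a second-order Taylor expansion to a logistic-loss term whose argument is split across two parties' partial scores, yielding a polynomial that is compatible with additive homomorphic evaluation.

```python
import math

# Illustrative sketch (assumed setup, not the paper's exact objective):
# two parties hold partial scores u and v, and the loss term
#     f(u, v) = log(1 + exp(-(u + v)))
# contains exp/log operations that additive HE cannot evaluate.
# Its second-order Taylor expansion around (u, v) = (0, 0) is
#     f(u, v) ~= log(2) - (u + v)/2 + (u + v)^2 / 8,
# a polynomial in u and v, so each term can be computed with only
# additions and scalar multiplications over ciphertexts.

def logloss_exact(u: float, v: float) -> float:
    """Exact bivariate log-loss term (not HE-friendly)."""
    return math.log(1.0 + math.exp(-(u + v)))

def logloss_taylor2(u: float, v: float) -> float:
    """Second-order Taylor approximation around (0, 0) (HE-friendly)."""
    z = u + v
    return math.log(2.0) - z / 2.0 + (z * z) / 8.0

if __name__ == "__main__":
    # Near the expansion point the approximation error is tiny.
    for u, v in [(0.1, 0.2), (-0.3, 0.4), (0.5, -0.1)]:
        print(f"u={u:+.1f} v={v:+.1f} "
              f"exact={logloss_exact(u, v):.6f} "
              f"taylor={logloss_taylor2(u, v):.6f}")
```

The approximation is accurate only near the expansion point, which is why such schemes typically re-expand around the current model state or clip inputs; far from the origin the polynomial diverges from the true loss.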
Pages: 1826-1839
Page count: 14