Serverless Federated AUPRC Optimization for Multi-Party Collaborative Imbalanced Data Mining

Cited by: 2
Authors
Wu, Xidong [1]
Hu, Zhengmian [1]
Pei, Jian [2]
Huang, Heng [3]
Affiliations
[1] Univ Pittsburgh, Dept Elect & Comp Engn, Pittsburgh, PA 15260 USA
[2] Duke Univ, Dept Comp Sci, Durham, NC 27706 USA
[3] Univ Maryland, Dept Comp Sci, College Pk, MD 20742 USA
Source
PROCEEDINGS OF THE 29TH ACM SIGKDD CONFERENCE ON KNOWLEDGE DISCOVERY AND DATA MINING, KDD 2023 | 2023
Keywords
AUPRC; federated learning; imbalanced data; stochastic optimization; serverless federated learning
DOI
10.1145/3580305.3599499
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
To address big data challenges, serverless multi-party collaborative training has recently attracted attention in the data mining community, since it can cut communication costs by avoiding the server node bottleneck. However, traditional serverless multi-party collaborative training algorithms were mainly designed for balanced data mining tasks and are intended to optimize accuracy (e.g., via the cross-entropy loss). The data distribution in many real-world applications is skewed, and classifiers trained to improve accuracy perform poorly on imbalanced data tasks, since the models can be significantly biased toward the primary class. Therefore, the Area Under the Precision-Recall Curve (AUPRC) was introduced as an effective metric. Although multiple single-machine methods have been designed to train models for AUPRC maximization, algorithms for multi-party collaborative training have never been studied. The change from the single-machine to the multi-party setting poses critical challenges. For example, existing single-machine AUPRC maximization algorithms maintain an inner state for each local data point; these methods are therefore not applicable to large-scale multi-party collaborative training, because of their dependence on each local data point. To address this challenge, we reformulate the serverless multi-party collaborative AUPRC maximization problem as a conditional stochastic optimization problem in a serverless multi-party collaborative learning setting and propose a new ServerLess biAsed sTochastic gradiEnt (SLATE) algorithm to directly optimize the AUPRC. We then apply a variance reduction technique and propose the ServerLess biAsed sTochastic gradiEnt with Momentum-based variance reduction (SLATE-M) algorithm to improve the convergence rate, which matches the best theoretical convergence result reached by the single-machine online method. To the best of our knowledge, this is the first work to solve the multi-party collaborative AUPRC maximization problem. Finally, extensive experiments show the advantages of directly optimizing the AUPRC with distributed learning methods and also verify the efficiency of our new algorithms (i.e., SLATE and SLATE-M).
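The abstract's point that accuracy can be uninformative on imbalanced data while AUPRC is not can be illustrated with a minimal sketch. It is not the SLATE or SLATE-M algorithm; it only computes average precision, a common empirical estimate of AUPRC, next to thresholded accuracy. The function names, the 0.5 threshold, and the toy scores and labels are all hypothetical and chosen for illustration.

```python
def average_precision(scores, labels):
    """Mean of precision@k taken at every rank k where a positive appears."""
    n_pos = sum(labels)
    if n_pos == 0:
        raise ValueError("need at least one positive example")
    # Rank example indices by descending score.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    hits = 0
    total = 0.0
    for rank, i in enumerate(order, start=1):
        if labels[i] == 1:
            hits += 1
            total += hits / rank  # precision at this rank
    return total / n_pos


def accuracy(scores, labels, threshold=0.5):
    """Fraction of examples classified correctly at a fixed threshold."""
    preds = [1 if s >= threshold else 0 for s in scores]
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)


# Hypothetical imbalanced toy set: 2 positives, 8 negatives.
labels = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
# Model A ranks one negative above both positives.
scores_a = [0.40, 0.30, 0.45, 0.20, 0.10, 0.10, 0.10, 0.10, 0.10, 0.10]
# Model B ranks both positives first.
scores_b = [0.40, 0.30, 0.20, 0.20, 0.10, 0.10, 0.10, 0.10, 0.10, 0.10]

for name, s in (("A", scores_a), ("B", scores_b)):
    print(name, accuracy(s, labels), round(average_precision(s, labels), 4))
```

Both toy models predict the negative class for every example at the assumed threshold, so their accuracies are identical, while average precision separates the model that ranks the positives first from the one that does not.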
Pages: 2648 - 2659
Page count: 12
Related papers
41 items in total
  • [21] Secure and efficient federated learning via novel multi-party computation and compressed sensing
    Chen, Lvjun
    Xiao, Di
    Yu, Zhuyang
    Zhang, Maolan
    INFORMATION SCIENCES, 2024, 667
  • [22] Privacy-preserving federated learning for collaborative medical data mining in multi-institutional settings
    Rahul Haripriya
    Nilay Khare
    Manish Pandey
    Scientific Reports, 15 (1)
  • [23] A Hybrid Federated Learning Framework With Dynamic Task Allocation for Multi-Party Distributed Load Prediction
    Liu, Haizhou
    Zhang, Xuan
    Shen, Xinwei
    Sun, Hongbin
    Shahidehpour, Mohammad
    IEEE TRANSACTIONS ON SMART GRID, 2023, 14 (03) : 2460 - 2472
  • [24] PrivatEyes: Appearance-based Gaze Estimation Using Federated Secure Multi-Party Computation
    Elfares M.
    Reisert P.
    Hu Z.
    Tang W.
    Küsters R.
    Bulling A.
    Proceedings of the ACM on Human-Computer Interaction, 2024, 8 (ETRA)
  • [25] Two-Phase Multi-Party Computation Enabled Privacy-Preserving Federated Learning
    Kanagavelu, Renuga
    Li, Zengxiang
    Samsudin, Juniarto
    Yang, Yechao
    Yang, Feng
    Goh, Rick Siow Mong
    Cheah, Mervyn
    Wiwatphonthana, Praewpiraya
    Akkarajitsakul, Khajonpong
    Wang, Shangguang
    2020 20TH IEEE/ACM INTERNATIONAL SYMPOSIUM ON CLUSTER, CLOUD AND INTERNET COMPUTING (CCGRID 2020), 2020 : 410 - 419
  • [26] A Privacy-Preserving Federated Learning Framework for IoT Environment Based on Secure Multi-party Computation
    Geng, Tieming
    Liu, Jian
    Huang, Chin-Tser
    2024 IEEE ANNUAL CONGRESS ON ARTIFICIAL INTELLIGENCE OF THING, AIOT 2024, 2024 : 117 - 122
  • [27] Privacy-Enhanced Pneumonia Diagnosis: IoT-Enabled Federated Multi-Party Computation in Industry 5.0
    Siddique, Ali Akbar
    Boulila, Wadii
    Alshehri, Mohammed S.
    Ahmed, Fawad
    Gadekallu, Thippa Reddy
    Victor, Nancy
    Qadri, M. Tahir
    Ahmad, Jawad
    IEEE TRANSACTIONS ON CONSUMER ELECTRONICS, 2024, 70 (01) : 1923 - 1939
  • [28] Carbon Emissions Prediction and Optimization Method of Hobbing with Multi-source Data Collaborative Based on Federated Learning
    Yi, Qian
    Xu, Yan
    Li, Congbo
    Li, Chuanjiang
    Cao, Huajun
    INTERNATIONAL JOURNAL OF PRECISION ENGINEERING AND MANUFACTURING-GREEN TECHNOLOGY, 2025
  • [29] Hybrid Algorithm Based on Simulated Annealing and Bacterial Foraging Optimization for Mining Imbalanced Data
    Lee, Chou-Yuan
    Lee, Zne-Jung
    Huang, Jian-Qiong
    Ye, Fu-Lan
    Yao, Jie
    Ning, Zheng-Yuan
    Meen, Teen-Hang
    SENSORS AND MATERIALS, 2021, 33 (04) : 1297 - 1312
  • [30] Multiparticipant Federated Feature Selection Algorithm With Particle Swarm Optimization for Imbalanced Data Under Privacy Protection
    Hu Y.
    Zhang Y.
    Gong D.
    Sun X.
    IEEE Transactions on Artificial Intelligence, 2023, 4 (05) : 1002 - 1016