Towards Unsupervised Sudden Data Drift Detection in Federated Learning with Fuzzy Clustering

Cited by: 0
Authors
Stallmann, Morris [1]
Wilbik, Anna [1]
Weiss, Gerhard [1]
Affiliations
[1] Maastricht Univ, Dept Adv Comp Sci, Maastricht, Netherlands
Source
2024 IEEE INTERNATIONAL CONFERENCE ON FUZZY SYSTEMS, FUZZ-IEEE 2024 | 2024
Keywords
federated learning; fuzzy clustering; unsupervised; drift; drift detection; federated drift detection; federated data drift detection; FCM
DOI
10.1109/FUZZ-IEEE60900.2024.10611883
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Federated learning (FL) is a machine learning (ML) discipline that allows ML models to be trained on distributed data without revealing raw data instances. It promises to enable ML in environments with data sharing constraints, e.g., due to data privacy concerns or other considerations. Data drift and concept drift commonly refer to unpredictable changes in data distributions over time. They are known to impact an ML model's performance in many real-world scenarios. While drift detection and adaptation have been studied extensively in the non-federated setting, they are still less explored in the FL setting. The private and distributed nature of data makes drift detection much harder in FL, since no entity can oversee all data instances to estimate changes in the global data distribution. In this paper, we propose a novel unsupervised federated data drift detection method based on federated fuzzy c-means clustering and the federated fuzzy Davies-Bouldin index, a global cluster validation metric. First, an initial global data model is learned using the federated fuzzy c-means clustering algorithm. Second, the federated fuzzy Davies-Bouldin index Delta is calculated, estimating how well the data fits the learned model. Third, whenever a new batch of data is available at time t, the fit between the initial data model and the new data is evaluated through the federated fuzzy Davies-Bouldin index Delta(t). Finally, Delta and Delta(t) are compared to detect drift. The method is unsupervised, as it does not require any labels, and it detects global data drift while keeping all data private. We evaluate our method carefully in a controlled environment by simulating multiple federated drift scenarios. The results are promising: the method rarely raises false alarms and detects drift in multiple scenarios. We also observe shortcomings, such as sensitivity to parameter choices and a low detection rate when only a few data points in a new batch are affected by drift.
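The four steps in the abstract can be illustrated with a minimal, centralized sketch. It is not the authors' federated implementation: the abstract does not give the federated protocol, the exact form of the fuzzy Davies-Bouldin index, or the rule for comparing Delta and Delta(t). The sketch therefore assumes a single-machine fuzzy c-means, one common fuzzy generalization of the Davies-Bouldin index, and a hypothetical relative-change threshold `tol`. All function names are illustrative.

```python
import numpy as np

def memberships(X, centers, m=2.0):
    """Fuzzy membership of every point (rows) in every cluster (columns)."""
    d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
    d = np.maximum(d, 1e-12)  # guard against a point sitting exactly on a center
    inv = d ** (-2.0 / (m - 1.0))
    return inv / inv.sum(axis=1, keepdims=True)

def farthest_first_init(X, c):
    """Deterministic initialization (an assumption, not taken from the paper)."""
    centers = [X[0]]
    for _ in range(c - 1):
        d = np.linalg.norm(X[:, None, :] - np.array(centers)[None, :, :], axis=2)
        centers.append(X[np.argmax(d.min(axis=1))])
    return np.array(centers)

def fuzzy_c_means(X, c, m=2.0, n_iter=100):
    """Step 1: learn the initial data model (cluster centers), centralized."""
    centers = farthest_first_init(X, c)
    for _ in range(n_iter):
        W = memberships(X, centers, m) ** m
        centers = (W.T @ X) / W.sum(axis=0)[:, None]
    return centers

def fuzzy_davies_bouldin(X, centers, m=2.0):
    """Steps 2 and 3: fit of data X to fixed centers (lower means better fit)."""
    W = memberships(X, centers, m) ** m
    d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
    # Membership-weighted scatter of each cluster around its center.
    scatter = np.sqrt((W * d ** 2).sum(axis=0) / W.sum(axis=0))
    sep = np.linalg.norm(centers[:, None, :] - centers[None, :, :], axis=2)
    np.fill_diagonal(sep, np.inf)  # exclude a cluster's comparison with itself
    ratios = (scatter[:, None] + scatter[None, :]) / sep
    return float(ratios.max(axis=1).mean())

def drift_detected(delta, delta_t, tol=0.5):
    """Step 4: hypothetical rule, flag drift on a large relative change."""
    return abs(delta_t - delta) / delta > tol

def make_batch(rng, shift=0.0, n=200):
    """Synthetic batch of two Gaussian blobs, optionally shifted to simulate drift."""
    a = rng.normal(0.0, 1.0, (n, 2))
    b = rng.normal(10.0, 1.0, (n, 2))
    return np.vstack([a, b]) + shift

rng = np.random.default_rng(0)
centers = fuzzy_c_means(make_batch(rng), c=2)
delta = fuzzy_davies_bouldin(make_batch(rng), centers)
stable_flag = drift_detected(delta, fuzzy_davies_bouldin(make_batch(rng), centers))
drift_flag = drift_detected(delta, fuzzy_davies_bouldin(make_batch(rng, shift=5.0), centers))
print(stable_flag, drift_flag)
```

Because the centers stay fixed after the first step, only the membership-weighted scatter changes between batches, so a rise in the index reflects new data fitting the initial model less well. In a federated variant, the per-cluster sums would presumably be aggregated across clients rather than computed on pooled data.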
Pages: 8