Federated Feature Selection for Horizontal Federated Learning in IoT Networks

Cited by: 28
Authors
Zhang, Xunzheng [1]
Mavromatis, Alex [1]
Vafeas, Antonis [1]
Nejabati, Reza [1]
Simeonidou, Dimitra [1]
Affiliations
[1] Univ Bristol, Fac Engn, Sch Comp Sci Elect & Elect Engn & Engn Maths, Smart Internet Lab, High Performance Networks Grp, Bristol BS8 1QU, England
Keywords
Internet of Things; Feature extraction; Distributed databases; Federated learning; Clustering algorithms; Training; Data models; Data cleaning; feature selection (FS); federated learning (FL); Internet of Things (IoT); unsupervised machine learning (ML); 6G; CHALLENGES; INTERNET; SUPPORT; THINGS
DOI
10.1109/JIOT.2023.3237032
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Under horizontal federated learning (HFL) in Internet of Things (IoT) scenarios, different user data sets share largely similar feature spaces, and the final goal is to build a high-performance global model. However, not all features contribute when training the global HFL model; some even impair it. Besides, the curse of dimensionality lengthens training time and increases energy consumption (EC). It is therefore critical to remove irrelevant features locally and to select the useful overlapping features from a federated, global perspective. In addition, the uncertainty about whether data are labeled and the non-independent and identically distributed (non-IID) nature of client data should also be considered. This article introduces an unsupervised federated feature selection approach (named FSHFL) for HFL in IoT networks. First, a feature relevance outlier detection method, combined with an improved one-class support vector machine, is applied to the HFL participants to remove useless features. Second, a feature relevance hierarchical clustering (FRHC) algorithm is proposed for HFL overlapping feature selection. Experimental results on four IoT data sets show that the proposed methods select better federated feature sets among HFL participants, thus improving the performance of the HFL system. Specifically, the global model accuracy improves by up to 1.68% because fewer irrelevant features are used. Moreover, FSHFL lowers the average training time by as much as 6.9%. Finally, when the global model reaches the same test accuracy, FSHFL decreases the average EC of training the model by approximately 2.85% compared with federated averaging and by roughly 68.39% compared with Fed-SGD.
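The two-stage idea in the abstract (each participant first discards irrelevant features locally, then the overlapping useful features are chosen from a global view) can be illustrated with a deliberately simplified sketch. This is not the FSHFL method: per-feature variance stands in for the paper's feature relevance measure, a median-based cutoff stands in for the improved one-class support vector machine, and a majority vote across clients stands in for FRHC. All function names, parameters, feature names, and data values below are hypothetical and exist only for illustration.

```python
from collections import Counter
from statistics import median, pvariance


def local_feature_filter(rows, feature_names, ratio=0.1):
    """Client side: keep features whose variance is >= ratio * median variance.

    Assumption: variance is only a placeholder relevance score, and the
    median rule is a placeholder for the one-class SVM outlier detector.
    """
    columns = list(zip(*rows))  # transpose rows into per-feature columns
    scores = [pvariance(col) for col in columns]
    cutoff = ratio * median(scores)
    return {name for name, score in zip(feature_names, scores) if score >= cutoff}


def federated_overlap(client_selections, min_share=0.5):
    """Server side: keep features retained by more than min_share of clients.

    Assumption: a simple vote replaces the hierarchical clustering step.
    """
    votes = Counter(name for kept in client_selections for name in kept)
    needed = min_share * len(client_selections)
    return sorted(name for name, count in votes.items() if count > needed)


# Hypothetical toy data: three clients share one feature space (horizontal FL),
# but client_b sees a much noisier "noise" column than the others (non-IID).
names = ["temp", "humidity", "const_flag", "noise"]
client_a = [[20.0, 30.0, 1.0, 0.50], [22.0, 35.0, 1.0, 0.51], [25.0, 33.0, 1.0, 0.50]]
client_b = [[18.0, 40.0, 1.0, 0.10], [21.0, 42.0, 1.0, 0.90], [19.0, 47.0, 1.0, 0.50]]
client_c = [[30.0, 50.0, 1.0, 0.40], [30.5, 55.0, 1.0, 0.41], [33.0, 52.0, 1.0, 0.40]]

# Only the selected feature names leave each client, never the raw rows.
selections = [local_feature_filter(c, names) for c in (client_a, client_b, client_c)]
global_features = federated_overlap(selections)
print(global_features)
```

In this toy run, only client_b retains the "noise" feature, so the vote drops it from the global set, while the constant column is removed locally by every client. A real system would replace both placeholders with the relevance scoring, outlier detection, and clustering procedures described in the article itself.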
Pages: 10095-10112
Number of pages: 18