An Efficient Federated Genetic Programming Framework for Symbolic Regression

Cited by: 11
Authors
Dong, Junlan [1]
Zhong, Jinghui [1]
Chen, Wei-Neng [1]
Zhang, Jun [2, 3]
Affiliations
[1] South China Univ Technol, Sch Comp Sci & Engn, Guangzhou 510006, Peoples R China
[2] Zhejiang Normal Univ, Jinhua 321004, Zhejiang, Peoples R China
[3] Hanyang Univ, Ansan 15588, South Korea
Source
IEEE TRANSACTIONS ON EMERGING TOPICS IN COMPUTATIONAL INTELLIGENCE | 2023, Vol. 7, No. 3
Funding
National Natural Science Foundation of China
Keywords
Federated genetic programming; mean shift aggregation; decentralized data; data privacy;
DOI
10.1109/TETCI.2022.3201299
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Symbolic regression is an important method of data-driven modeling that provides explicit mathematical expressions for data analysis. However, existing genetic programming algorithms for symbolic regression require centralized storage of all data, which is unrealistic in many practical applications that involve data privacy. If the data come from different sources, such as hospitals and banks, centralizing them is prone to privacy breaches and security issues. To this end, we propose an efficient federated genetic programming framework that can train a global model without integrating the data. Each client processes its decentralized data locally and in parallel, without sending the original data to the server. This method not only protects the privacy of the data but also reduces the time required for data collection. Moreover, a mean shift aggregation mechanism is developed for aggregating local fitness. By taking the samples' relative importance into account, the mechanism mitigates the imbalance of real-life symbolic regression data by incorporating weights into the fitness function. Furthermore, based on this framework and self-learning gene expression programming (SL-GEP), a federated self-learning gene expression programming algorithm is developed. The experimental results show that, compared with standard SL-GEP trained on decentralized data only, the proposed federated genetic programming method effectively protects data privacy and achieves consistently better generalization performance.
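The general idea of aggregating locally computed fitness can be illustrated with a minimal Python sketch. This is not the paper's mean shift aggregation mechanism, whose details are not given in this record; it assumes a simple sample-count-weighted average of squared errors instead. The names `local_fitness`, `aggregate_fitness`, and `federated_select`, as well as the toy data and candidate expressions, are hypothetical.

```python
import math
from typing import Callable, List, Sequence, Tuple

Sample = Tuple[float, float]          # one (x, y) observation
Expr = Callable[[float], float]       # a candidate symbolic expression


def local_fitness(expr: Expr, data: Sequence[Sample]) -> Tuple[float, int]:
    """Client side: evaluate a candidate on local data only.

    Returns the sum of squared errors and the local sample count;
    the raw (x, y) samples are never returned.
    """
    sse = sum((expr(x) - y) ** 2 for x, y in data)
    return sse, len(data)


def aggregate_fitness(reports: Sequence[Tuple[float, int]]) -> float:
    """Server side: combine client reports into one RMSE.

    ASSUMPTION: plain sample-count weighting, used here in place of
    the paper's mean shift aggregation.
    """
    total_sse = sum(sse for sse, _ in reports)
    total_n = sum(n for _, n in reports)
    return math.sqrt(total_sse / total_n)


def federated_select(
    candidates: Sequence[Expr], clients: Sequence[Sequence[Sample]]
) -> Tuple[int, List[float]]:
    """Score every candidate across all clients and return the best index."""
    scores = []
    for expr in candidates:
        reports = [local_fitness(expr, data) for data in clients]
        scores.append(aggregate_fitness(reports))
    best = min(range(len(candidates)), key=scores.__getitem__)
    return best, scores


# Toy data: two clients hold disjoint samples of y = x^2 + x.
clients = [
    [(0.0, 0.0), (1.0, 2.0), (2.0, 6.0)],
    [(-1.0, 0.0), (3.0, 12.0)],
]
candidates = [lambda x: x * x + x, lambda x: 2 * x]

best, scores = federated_select(candidates, clients)
print(best, scores)
```

In this sketch the server only sees per-client error totals and sample counts, and the candidate with the lowest aggregated error is the one a genetic programming loop would favor during selection.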
Pages: 858-871
Page count: 14