An Efficient Federated Genetic Programming Framework for Symbolic Regression

Cited by: 11
Authors
Dong, Junlan [1]
Zhong, Jinghui [1]
Chen, Wei-Neng [1]
Zhang, Jun [2, 3]
Affiliations
[1] South China Univ Technol, Sch Comp Sci & Engn, Guangzhou 510006, Peoples R China
[2] Zhejiang Normal Univ, Jinhua 321004, Zhejiang, Peoples R China
[3] Hanyang Univ, Ansan 15588, South Korea
Source
IEEE TRANSACTIONS ON EMERGING TOPICS IN COMPUTATIONAL INTELLIGENCE | 2023 / Vol. 7 / No. 3
Funding
National Natural Science Foundation of China;
Keywords
Federated genetic programming; mean shift aggregation; decentralized data; data privacy;
DOI
10.1109/TETCI.2022.3201299
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
Symbolic regression is an important data-driven modeling method that yields explicit mathematical expressions for data analysis. However, existing genetic programming algorithms for symbolic regression require all data to be stored centrally, which is unrealistic in many practical applications that involve data privacy. When the data come from different sources, such as hospitals and banks, centralizing them invites privacy breaches and security issues. To this end, we propose an efficient federated genetic programming framework that can train a global model without integrating the data. Each client processes its decentralized data locally and in parallel, without sending the original data to the server. This not only protects data privacy but also reduces the time required for data collection. Moreover, a mean shift aggregation mechanism is developed for aggregating local fitness. Taking the relative importance of samples into account, the mechanism mitigates the imbalance of real-life symbolic regression data by incorporating weights into the fitness function. Furthermore, based on this framework and self-learning gene expression programming (SL-GEP), a federated self-learning gene expression programming algorithm is developed. Experimental results show that, compared with standard SL-GEP trained on decentralized data only, the proposed federated genetic programming method protects data privacy effectively and achieves consistently better generalization performance.
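The data flow described in the abstract (clients evaluate a candidate expression locally and the server aggregates only the resulting fitness values) can be illustrated with a minimal sketch. This is not the paper's mean shift aggregation mechanism, whose details are not given in this record; as a stand-in, the sketch assumes plain sample-count weighting of local mean squared errors. The names `Client`, `local_fitness`, `aggregate_fitness`, and `global_fitness` are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple


@dataclass
class Client:
    """Holds private data; only fitness values ever leave the client."""
    xs: Sequence[float]
    ys: Sequence[float]

    def local_fitness(self, model: Callable[[float], float]) -> Tuple[float, int]:
        # Mean squared error of the candidate expression on local data,
        # returned together with the local sample count.
        n = len(self.xs)
        mse = sum((model(x) - y) ** 2 for x, y in zip(self.xs, self.ys)) / n
        return mse, n


def aggregate_fitness(reports: List[Tuple[float, int]]) -> float:
    # ASSUMPTION: sample-count weighting stands in for the paper's
    # mean shift aggregation, which is not specified in the abstract.
    total = sum(n for _, n in reports)
    return sum(mse * n for mse, n in reports) / total


def global_fitness(model: Callable[[float], float], clients: List[Client]) -> float:
    # The server sees (fitness, count) pairs only, never raw samples.
    return aggregate_fitness([c.local_fitness(model) for c in clients])


# Two clients holding disjoint samples of y = x^2 + 1.
clients = [
    Client(xs=[0.0, 1.0], ys=[1.0, 2.0]),
    Client(xs=[2.0, 3.0, 4.0], ys=[5.0, 10.0, 17.0]),
]
exact = global_fitness(lambda x: x * x + 1, clients)
linear = global_fitness(lambda x: 2 * x, clients)
```

With this weighting, the aggregated value equals the mean squared error that would be obtained on the pooled data, so candidate expressions can be ranked globally without any client sharing its samples. A weighting scheme that accounts for sample importance, as the abstract describes, would replace `aggregate_fitness`.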
Pages: 858-871
Number of pages: 14