GECCO'2022 Symbolic Regression Competition: Post-analysis of the Operon Framework

Cited by: 7
Author
Burlacu, Bogdan [1]
Affiliation
[1] Univ Appl Sci Upper Austria, Heurist & Evolutionary Algorithms Lab, Hagenberg, Austria
Source
PROCEEDINGS OF THE 2023 GENETIC AND EVOLUTIONARY COMPUTATION CONFERENCE COMPANION, GECCO 2023 COMPANION | 2023
Keywords
symbolic regression; overfitting; interpretability; model selection; minimum description length; bayesian information criterion
DOI
10.1145/3583133.3596390
Chinese Library Classification
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Operon is a C++ framework for symbolic regression that can perform local search by optimizing model coefficients with the Levenberg-Marquardt algorithm. This enhancement has proven effective in a variety of regression tasks. Operon took part in the Interpretable Symbolic Regression for Data Science competition hosted at the 2022 Genetic and Evolutionary Computation Conference, where it ranked 4th overall based on accuracy, simplicity, and task-specific goals. Although accurate, the returned models were exceedingly complex and ranked poorly in terms of simplicity. In this paper, we investigate the application of the Minimum Description Length (MDL) principle for selecting models with a better compromise between accuracy and complexity from the final Pareto front returned by the algorithm. A new experiment on the synthetic track of the competition highlights the critical role played by model selection in algorithm performance. The MDL-enhanced approach obtains the best overall score and demonstrates excellent results on all synthetic tracks.
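The idea of picking one model from a Pareto front by trading accuracy against complexity can be illustrated with a minimal sketch. It is not the paper's MDL formulation and does not use the Operon API: it substitutes the Bayesian information criterion (listed among the keywords) under an assumed Gaussian error model, and the names `Candidate`, `bic`, `select_model`, and the example numbers are all hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class Candidate:
    """Hypothetical summary of one model on the final Pareto front."""
    name: str       # illustrative label, not an Operon identifier
    n_params: int   # number of fitted coefficients (complexity proxy)
    rss: float      # residual sum of squares on the training data


def bic(c: Candidate, n_samples: int) -> float:
    # Gaussian-error BIC up to an additive constant:
    #   n * ln(RSS / n) + k * ln(n)
    # The first term rewards accuracy, the second penalizes complexity.
    return (n_samples * math.log(c.rss / n_samples)
            + c.n_params * math.log(n_samples))


def select_model(front: list[Candidate], n_samples: int) -> Candidate:
    # Choose the candidate with the lowest criterion value.
    return min(front, key=lambda c: bic(c, n_samples))


# Made-up example front: the largest model is the most accurate,
# but its small accuracy gain does not justify 40 coefficients.
front = [
    Candidate("small", 2, 12.0),
    Candidate("medium", 5, 4.0),
    Candidate("large", 40, 3.5),
]
best = select_model(front, n_samples=100)
print(best.name)
```

With these assumed numbers the criterion prefers the mid-sized model over the most accurate one, which is the kind of compromise the abstract attributes to MDL-based selection.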
Pages: 2412-2419
Number of pages: 8