Dimensionally-consistent equation discovery through probabilistic attribute grammars

Cited by: 4
Authors
Brence, Jure [1, 2]
Dzeroski, Saso [1, 2]
Todorovski, Ljupco [1, 3]
Affiliations
[1] Jozef Stefan Inst, Dept Knowledge Technol, Jamova Cesta 39, Ljubljana 1000, Slovenia
[2] Jozef Stefan Int Postgrad Sch, Jamova Cesta 39, Ljubljana 1000, Slovenia
[3] Univ Ljubljana, Fac Math & Phys, Dept Math, Jadranska 21, Ljubljana 1000, Slovenia
Keywords
Equation discovery; Symbolic regression; Dimensional analysis; Units of measurement; Background knowledge; Computational scientific discovery; PHYSICALLY SIMILAR SYSTEMS; DRIVEN;
DOI
10.1016/j.ins.2023.03.073
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Equation discovery, also known as symbolic regression, is a machine learning task of inducing closed-form equations from data and background knowledge. The latter takes various forms. Domain-specific knowledge can constrain the space of candidate equations to those that make sense in the scientific or engineering domain of use. Cross-domain knowledge, on the other hand, imposes general rules for model acceptability, such as parsimony, understandability, or consistency of the equations with the dimensional units of the variables. In this paper, we propose using attribute grammars to ensure the induced equations' dimensional consistency. Attribute grammars are flexible enough to combine cross-domain knowledge on dimensional consistency with domain-specific knowledge expressed as a probabilistic context-free grammar. At the same time, we show that attribute grammars can be efficiently transformed into probabilistic context-free grammars for equation discovery with existing algorithms. Finally, we provide empirical evidence that attribute grammars ensuring dimensional consistency of equations can significantly improve the performance of equation discovery on the standard set of a hundred Feynman benchmarks.
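The idea of dimensional consistency mentioned in the abstract can be illustrated with a small sketch. It is not the authors' implementation and does not use attribute grammars or probabilistic grammars as such. It only shows the kind of bottom-up unit propagation that a synthesized attribute could perform on an expression tree. The names `Unit`, `Expr`, `unit_of`, and `VAR_UNITS`, the choice of base dimensions (length, time, mass), and the tuple encoding of expressions are all assumptions made for illustration.

```python
from typing import Dict, Optional, Tuple, Union

# Assumption: a unit is a vector of integer exponents over the base
# dimensions (length, time, mass), e.g. acceleration m/s^2 -> (1, -2, 0).
Unit = Tuple[int, ...]
# Assumption: an expression is a variable name or a tuple (operator, left, right).
Expr = Union[str, Tuple[str, "Expr", "Expr"]]


def unit_of(expr: Expr, var_units: Dict[str, Unit]) -> Optional[Unit]:
    """Return the unit of expr, or None if expr is dimensionally inconsistent."""
    if isinstance(expr, str):
        return var_units[expr]
    op, left, right = expr
    left_unit = unit_of(left, var_units)
    right_unit = unit_of(right, var_units)
    if left_unit is None or right_unit is None:
        return None  # an inconsistent subexpression makes the whole expression inconsistent
    if op in ("+", "-"):
        # Addition and subtraction require both operands to share the same unit.
        return left_unit if left_unit == right_unit else None
    if op == "*":
        # Multiplication adds the exponents of the operand units.
        return tuple(a + b for a, b in zip(left_unit, right_unit))
    if op == "/":
        # Division subtracts the exponents of the operand units.
        return tuple(a - b for a, b in zip(left_unit, right_unit))
    raise ValueError(f"unsupported operator: {op}")


# Hypothetical variables: mass m, acceleration a, velocity v.
VAR_UNITS: Dict[str, Unit] = {"m": (0, 0, 1), "a": (1, -2, 0), "v": (1, -1, 0)}

print(unit_of(("*", "m", "a"), VAR_UNITS))  # (1, -2, 1), the unit of force
print(unit_of(("+", "a", "v"), VAR_UNITS))  # None, adding acceleration to velocity is inconsistent
```

In this sketch, a candidate equation would be rejected when `unit_of` returns `None` or when the resulting unit differs from the unit of the target variable.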
Pages: 742-756
Page count: 15