Statistical Tools for Symbolic Regression


Symbolic Regression can produce highly accurate regression models, but it still lacks the comprehensive suite of statistical tools available for conventional regression analysis, in particular for Generalized Linear Models. Many of these well-established tools can be adapted to Symbolic Regression, making it possible to derive confidence intervals for model parameters and predictions, perform functional decomposition, assess feature importance, and apply other techniques that improve our understanding of these models.

In this project, we aim to implement and adapt such statistical tools, focusing on extending the srtree-opt program, which can parse and process many symbolic regression models. With these extensions, we intend to bridge the gap between Symbolic Regression and the extensive statistical toolkit available for traditional regression analysis.

Desired knowledge: statistics, programming (Haskell and C++), regression analysis

Related publications: