Applied Artificial Intelligence Benchmarks for Denoising Symbolic Regression Signal: First Systematic Mapping of Deterministic vs Bayesian Approaches

Abstract

One limitation of symbolic regression (SR) models is their vulnerability to noise in the training data, which leads to inaccurate trend prediction. Although Bayesian methods are considered the default solution, few benchmarks currently exist for their effectiveness as a function of the noise level and the complexity of the relationship within the dataset. To establish such benchmarks, this work compares a stochastic non-causal Kalman smoother (Rauch-Tung-Striebel, RTS) with a deterministic smoothing spline method across various Gaussian noise levels. The latter denoising method is explored here for the first time in the SR literature. Both methods are tested on the recovery of 10 benchmark equations of varying difficulty with a classifying symbolic regression genetic programming algorithm. The results show that splines outperform the Kalman method in low-noise, close-to-linear environments, that Kalman performs better at higher noise levels, and that neither method consistently succeeds in complex, non-linear settings.
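The comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a local-level (random-walk) state model for the RTS smoother, uses SciPy's `UnivariateSpline` as a stand-in for the smoothing spline, and takes a noisy sine wave as a toy signal. The function name `rts_smooth` and the values of `q`, `r`, `sigma`, and the spline smoothing factor `s` are illustrative assumptions, not settings from the paper.

```python
import numpy as np
from scipy.interpolate import UnivariateSpline


def rts_smooth(y, q, r):
    """Kalman filter for a random-walk state, followed by an RTS backward pass.

    Assumed model (not from the paper): x[k] = x[k-1] + w, y[k] = x[k] + v,
    with process variance q and measurement variance r.
    """
    n = len(y)
    x_f = np.zeros(n)  # filtered means
    p_f = np.zeros(n)  # filtered variances
    x_p = np.zeros(n)  # one-step predicted means
    p_p = np.zeros(n)  # one-step predicted variances
    x_f[0] = x_p[0] = y[0]
    p_f[0] = p_p[0] = r
    # Forward (causal) Kalman filter.
    for k in range(1, n):
        x_p[k] = x_f[k - 1]
        p_p[k] = p_f[k - 1] + q
        gain = p_p[k] / (p_p[k] + r)
        x_f[k] = x_p[k] + gain * (y[k] - x_p[k])
        p_f[k] = (1.0 - gain) * p_p[k]
    # Backward (non-causal) Rauch-Tung-Striebel pass.
    x_s = x_f.copy()
    for k in range(n - 2, -1, -1):
        c = p_f[k] / p_p[k + 1]
        x_s[k] = x_f[k] + c * (x_s[k + 1] - x_p[k + 1])
    return x_s


rng = np.random.default_rng(0)
sigma = 0.2                                   # assumed Gaussian noise level
x = np.linspace(0.0, 2.0 * np.pi, 200)
clean = np.sin(x)                             # toy target, not a paper benchmark
noisy = clean + rng.normal(0.0, sigma, x.size)

# Deterministic denoiser: cubic smoothing spline; s ~ n * sigma^2 is a common heuristic.
spline = UnivariateSpline(x, noisy, k=3, s=x.size * sigma**2)
spline_est = spline(x)

# Stochastic denoiser: RTS smoother with hand-picked q (assumption).
rts_est = rts_smooth(noisy, q=1e-3, r=sigma**2)


def rmse(a, b):
    return float(np.sqrt(np.mean((a - b) ** 2)))


rmse_noisy = rmse(noisy, clean)
rmse_spline = rmse(spline_est, clean)
rmse_rts = rmse(rts_est, clean)
print(f"noisy={rmse_noisy:.3f} spline={rmse_spline:.3f} rts={rmse_rts:.3f}")
```

In a pipeline of the kind the abstract describes, the denoised series would then be passed to the symbolic regression algorithm in place of the raw noisy data. Which denoiser does better on this toy signal depends on the chosen `q`, `r`, and `s`, so the printed values should not be read as reproducing the paper's results.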
