Fractional Brownian Motion for Benchmarking Machine Learning Algorithms in Non-linear Estimation

Abstract

This paper introduces a novel benchmarking framework based on fractional Brownian motion (fBm) for evaluating advanced artificial intelligence (AI) and machine learning (ML) algorithms. The approach leverages the statistical structure of fBm to generate theoretically grounded, high-dimensional datasets of unlimited size, enabling reproducible and scalable model evaluation under controlled stochastic conditions. Rather than predicting the supremum or infimum of fBm paths directly, the proposed benchmark focuses on estimating a nonlinear functional derived from path extremes, a transformation that depends on the joint behavior of the supremum, infimum, and range of future segments. This formulation produces a complex, highly nonlinear target function that tests the limits of ML methods in functional estimation, particularly under noise, drift, and persistence variations. Six widely used ML algorithms were evaluated on this benchmark: Linear Regression (LR), Random Forest (RF), Gradient Boosting (GB), Support Vector Machines (SVM), Artificial Neural Networks (ANN), and k-Nearest Neighbours (kNN). Comparisons across these six architectures reveal a striking scaling paradox: while model performance is broadly comparable at small data sizes (N = 10²), a sharp divergence emerges at scale (N = 10⁴), where global mapping models (LR, MLP) converge to high predictive accuracy (R² ≈ 0.96) while local and ensemble methods suffer a total generalization collapse to deeply negative R² values. The benchmark is readily tunable through parameters including the Hurst exponent (H), drift (μ), and temporal horizon, enabling systematic stress-testing across both persistent and rough-path regimes. Overall, the fBm-based benchmark highlights the intrinsic difficulty of learning nonlinear functionals of stochastic paths and provides a tunable, mathematically robust environment for assessing algorithmic robustness.
This framework offers a rigorous, scalable testbed for advancing the development and comparison of ML methods under complex, noise-dominated dynamics.
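The generation step described above can be sketched in a few lines. The snippet below simulates an fBm path with drift via Cholesky factorization of the fractional Gaussian noise covariance and evaluates an illustrative nonlinear functional of the future segment's extremes; the specific transformation shown (`extreme_functional`) is a hypothetical stand-in, since the abstract does not give the paper's exact functional, and the parameter choices (H, μ, path length, split point) are arbitrary examples.

```python
import numpy as np

def fbm_path(n, H, mu=0.0, T=1.0, rng=None):
    """Simulate an fBm path with drift mu on [0, T] using Cholesky
    factorization of the fractional Gaussian noise (fGn) covariance."""
    rng = np.random.default_rng(rng)
    dt = T / n
    k = np.arange(n)
    # Autocovariance of unit-variance fGn: 0.5(|k+1|^2H - 2|k|^2H + |k-1|^2H)
    gamma = 0.5 * (np.abs(k + 1)**(2*H) - 2*np.abs(k)**(2*H) + np.abs(k - 1)**(2*H))
    cov = gamma[np.abs(k[:, None] - k[None, :])]
    L = np.linalg.cholesky(cov + 1e-12 * np.eye(n))  # jitter for numerical safety
    fgn = (L @ rng.standard_normal(n)) * dt**H       # scale increments to step dt
    t = np.linspace(dt, T, n)
    return np.concatenate([[0.0], np.cumsum(fgn) + mu * t])

def extreme_functional(path, split):
    """Hypothetical nonlinear target combining the supremum, infimum,
    and range of the future segment (illustration only)."""
    future = path[split:]
    sup, inf = future.max(), future.min()
    return (sup - inf) / (1.0 + np.abs(sup) + np.abs(inf))

# Example: persistent regime (H > 0.5) with positive drift
path = fbm_path(256, H=0.7, mu=0.1, rng=42)
y = extreme_functional(path, split=128)
```

In a benchmark loop, the first half of each path would serve as the model's input features and `y` as the regression target, with H and μ varied to stress-test persistent and rough-path regimes.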
