A Comparative Study of Statistical and Machine Learning Methods for Solar Irradiance Forecasting Using the Folsom PLC Dataset
Abstract
The increasing penetration of photovoltaic solar energy has intensified the need for accurate production forecasting to ensure efficient grid operation. This study critically compares traditional statistical methods and machine learning approaches for forecasting solar irradiance using the benchmark Folsom PLC dataset. Two primary research questions are addressed: whether machine learning models outperform traditional techniques, and whether time series modelling improves prediction accuracy. The analysis evaluates a range of models, namely statistical regressions (OLS, LASSO, Ridge), regression trees, neural networks, and random forests, applied under both physical modelling and time series approaches. Results reveal that while machine learning methods can outperform statistical models, particularly when exogenous weather features are included, they do not dominate across all forecasting horizons. Furthermore, models based purely on the time series approach perform worse. However, a hybrid approach that integrates physical models with machine learning demonstrates significantly improved accuracy. These findings highlight the value of hybrid models for photovoltaic forecasting and suggest strategic directions for operational implementation.