From Feature Selection to Forecasting: A Two-Stage Hybrid Framework for Food Price Prediction Using Economic Indicators in Türkiye
Abstract
This study develops a two-stage hybrid framework to forecast food prices in Türkiye, addressing the challenge of predicting inflation in volatile emerging markets where sample sizes are limited. In the first stage, systematic relationship analyses (correlation, ARDL, cointegration, and Granger causality tests) identified ten key macroeconomic predictors from Central Bank datasets. In the second stage, we evaluated a range of predictive models, including XGBoost, Gradient Boosting, Ridge, LSTM, and SVR, using rice prices as a pilot case. A key methodological contribution is the empirical comparison of feature engineering strategies: the results show that traditional “smoothing” techniques dilute volatility signals, whereas the “Log-Return Transformation Strategy” significantly improves accuracy. XGBoost was the best-performing model, achieving an R² of 0.932 (MAE: 1.68 TL) on the test set. To validate this performance rigorously under small-sample limitations, a Recursive Walk-Forward Validation was conducted, confirming the model’s robustness with an R² of 0.870 over a 31-month rolling simulation. Furthermore, Robust Rolling SHAP analysis identified Insurance and Transportation costs as the primary drivers, evidencing a strong cost-push mechanism and inflation inertia. These findings combine econometric rigor with machine learning transparency, offering resilient early warning tools for sustainable inflation management.
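The two techniques the abstract leans on most, the log-return transformation and recursive walk-forward validation, can be illustrated with a short sketch. This is not the authors' implementation: the function names (`log_returns`, `walk_forward_forecast`, `r2_score`) and the `min_train` parameter are hypothetical, the validation is assumed to use an expanding training window, and ordinary least squares stands in for XGBoost only to keep the example free of dependencies beyond NumPy.

```python
import numpy as np


def log_returns(prices):
    """Return r_t = ln(P_t / P_{t-1}) for a 1-D series of positive prices."""
    prices = np.asarray(prices, dtype=float)
    if prices.ndim != 1 or prices.size < 2:
        raise ValueError("need a 1-D series with at least two observations")
    if np.any(prices <= 0):
        raise ValueError("prices must be strictly positive")
    return np.diff(np.log(prices))


def walk_forward_forecast(X, y, min_train):
    """Expanding-window, one-step-ahead forecasts (assumed scheme).

    At each step t >= min_train the model is refit on rows [0, t) and then
    used to predict row t, so no future observation leaks into training.
    Ordinary least squares is a stand-in for the paper's XGBoost model.
    """
    X = np.asarray(X, dtype=float)  # shape (n_samples, n_features)
    y = np.asarray(y, dtype=float)  # shape (n_samples,)
    preds = []
    for t in range(min_train, len(y)):
        design = np.column_stack([np.ones(t), X[:t]])  # intercept + features
        coef, *_ = np.linalg.lstsq(design, y[:t], rcond=None)
        x_next = np.concatenate([[1.0], X[t]])
        preds.append(x_next @ coef)
    return np.array(preds)


def r2_score(y_true, y_pred):
    """Coefficient of determination, R^2 = 1 - SS_res / SS_tot."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot
```

Under these assumptions, a forecast made in log-return space maps back to a price level through P_t = P_{t-1} · exp(r_t), and the walk-forward R² is computed by comparing `walk_forward_forecast(X, y, min_train)` against `y[min_train:]`.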