Predicting AquaCrop-Simulated Durum Wheat Yield with Machine Learning: Algorithm Comparison and Agronomic Signal Convergence in the Capitanata Plain

Abstract

Five machine learning algorithms, namely Linear Regression (LR), Multilayer Perceptron (MLP), Support Vector Machine for regression (SMOreg), RandomTree, and Reduced Error Pruning Tree (REPTree), were trained and compared for predicting durum wheat (Triticum durum Desf.) grain yield simulated by AquaCrop-GIS across the Capitanata plain (Southern Italy). A dataset of 342 instances was constructed by crossing 25 soil profiles, three sowing dates, and two irrigation regimes over 15 climatic grid cells (2014–2023), and model performance was assessed by stratified 10-fold cross-validation. MLP achieved the highest accuracy (R = 0.983; MAE = 0.059 t ha⁻¹; RMSE = 0.083 t ha⁻¹), while the four interpretable models clustered at R = 0.891–0.907 (RMSE = 0.192–0.203 t ha⁻¹). All models converged on consistent agronomic signals: standard sowing (1 November) yielded 0.53 t ha⁻¹ more than late sowing (15 November); supplemental irrigation added 0.17 t ha⁻¹; and high-silt and clay soils produced higher yields. The SMOreg normalised weight vector identified autumn temperature (Tmin_oct_nov: −0.462; Tmax_oct_nov: −0.405) as the dominant climate predictor, reflecting the AquaCrop phenological mechanism whereby elevated early-season thermal loads curtail tillering. The convergence of directional signals across fundamentally different algorithmic architectures (linear, kernel-based, and tree-based) confirms that ML surrogates can efficiently emulate AquaCrop response surfaces for scenario analysis and decision support in Mediterranean dryland farming systems.
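
The evaluation protocol described above (10-fold cross-validation scored by R, MAE, and RMSE) can be illustrated with a minimal sketch. This is not the authors' pipeline: the data are synthetic, the single predictor only loosely mimics an autumn minimum temperature, the folds are interleaved rather than stratified, and every function and variable name is a hypothetical chosen for illustration.

```python
import math

def pearson_r(y_true, y_pred):
    """Pearson correlation between observed and predicted values."""
    n = len(y_true)
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mean_t) ** 2 for t in y_true)
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    return cov / math.sqrt(var_t * var_p)

def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    """Root mean squared error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def fit_line(xs, ys):
    """Ordinary least squares for one predictor; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def k_fold_predictions(xs, ys, k=10):
    """Out-of-fold predictions: each instance is predicted by a model
    fitted on the other k-1 folds (interleaved fold assignment)."""
    n = len(xs)
    preds = [0.0] * n
    for fold in range(k):
        train = [i for i in range(n) if i % k != fold]
        test = [i for i in range(n) if i % k == fold]
        slope, intercept = fit_line([xs[i] for i in train], [ys[i] for i in train])
        for i in test:
            preds[i] = slope * xs[i] + intercept
    return preds

# Synthetic, deterministic data (assumption): yield falls as temperature rises.
tmin = [6.0 + 0.1 * i for i in range(40)]                       # hypothetical deg C
yield_t_ha = [5.0 - 0.2 * x + 0.05 * math.sin(i) for i, x in enumerate(tmin)]

preds = k_fold_predictions(tmin, yield_t_ha, k=10)
r = pearson_r(yield_t_ha, preds)
err_mae = mae(yield_t_ha, preds)
err_rmse = rmse(yield_t_ha, preds)
print(f"R = {r:.3f}, MAE = {err_mae:.3f} t/ha, RMSE = {err_rmse:.3f} t/ha")
```

Pooling the out-of-fold predictions before scoring means every instance contributes exactly once to each metric, which is one common way to report cross-validated R, MAE, and RMSE on a small dataset.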
