Predicting AquaCrop-Simulated Durum Wheat Yield with Machine Learning: Algorithm Comparison and Agronomic Signal Convergence in the Capitanata Plain

Pasquale Garofalo
Anna Rita Bernadette Cammerino
Maria Riccardi

Abstract

Five machine learning algorithms, namely Linear Regression (LR), Multilayer Perceptron (MLP), Support Vector Machine for regression (SMOreg), RandomTree, and Reduced Error Pruning Tree (REPTree), were trained and compared for predicting durum wheat (Triticum durum Desf.) grain yield simulated by AquaCrop-GIS across the Capitanata plain (Southern Italy). A dataset of 342 instances was constructed by crossing 25 soil profiles, three sowing dates, and two irrigation regimes over 15 climatic grid cells (2014–2023), and the models were evaluated by stratified 10-fold cross-validation. MLP achieved the highest accuracy (R = 0.983; MAE = 0.059 t ha⁻¹; RMSE = 0.083 t ha⁻¹); the four interpretable models clustered at R = 0.891–0.907 (RMSE = 0.192–0.203 t ha⁻¹). All models converged on consistent agronomic signals: standard sowing (1 November) yielded 0.53 t ha⁻¹ more than late sowing (15 November); supplemental irrigation added 0.17 t ha⁻¹; and high-silt and clay soils produced superior yields. The SMOreg normalised weight vector identified autumn temperature (Tmin_oct_nov: −0.462; Tmax_oct_nov: −0.405) as the dominant climate predictor, reflecting the AquaCrop phenological mechanism whereby elevated early-season thermal loads curtail tillering. The convergence of directional signals across fundamentally different algorithmic architectures (linear, kernel-based, and tree-based) indicates that ML surrogates can efficiently emulate AquaCrop response surfaces for scenario analysis and decision support in Mediterranean dryland farming systems.
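
The evaluation protocol described in the abstract (10-fold cross-validation scored by R, MAE, and RMSE) can be illustrated with a minimal, standard-library Python sketch. Everything in it is an assumption made for illustration rather than the study's data or models: the 342 instances are synthetic, the 3.0 t ha⁻¹ baseline and 0.1 t ha⁻¹ noise level are hypothetical, only the two sowing dates and two irrigation regimes named in the abstract are encoded, the folds are shuffled rather than stratified, and the "surrogate" is a toy group-mean predictor standing in for LR, MLP, SMOreg, RandomTree, or REPTree. The only values borrowed from the abstract are the reported sowing (+0.53 t ha⁻¹) and irrigation (+0.17 t ha⁻¹) effects, which are used to generate the synthetic yields.

```python
import math
import random
from statistics import mean


def pearson_r(y_true, y_pred):
    """Pearson correlation coefficient (the 'R' reported in the abstract)."""
    mt, mp = mean(y_true), mean(y_pred)
    cov = sum((t - mt) * (p - mp) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mt) ** 2 for t in y_true)
    var_p = sum((p - mp) ** 2 for p in y_pred)
    return cov / math.sqrt(var_t * var_p)


def mean_abs_error(y_true, y_pred):
    """Mean absolute error, in the units of the target (t/ha here)."""
    return mean(abs(t - p) for t, p in zip(y_true, y_pred))


def root_mean_sq_error(y_true, y_pred):
    """Root mean squared error, in the units of the target (t/ha here)."""
    return math.sqrt(mean((t - p) ** 2 for t, p in zip(y_true, y_pred)))


def k_fold_indices(n, k, rng):
    """Shuffle 0..n-1 and deal the indices into k disjoint folds.
    Assumption: plain shuffled folds, not the stratified folds of the study."""
    idx = list(range(n))
    rng.shuffle(idx)
    return [idx[i::k] for i in range(k)]


def fit_group_means(rows, targets):
    """Toy surrogate (hypothetical): mean yield per (sowing, irrigation) pair."""
    groups = {}
    for row, y in zip(rows, targets):
        groups.setdefault(row, []).append(y)
    overall = mean(targets)
    table = {key: mean(vals) for key, vals in groups.items()}
    return lambda row: table.get(row, overall)


def cross_validate(rows, targets, k=10, seed=0):
    """Predict every instance once, from a model fitted on the other folds."""
    folds = k_fold_indices(len(rows), k, random.Random(seed))
    preds = [None] * len(rows)
    for fold in folds:
        held_out = set(fold)
        train = [i for i in range(len(rows)) if i not in held_out]
        model = fit_group_means([rows[i] for i in train],
                                [targets[i] for i in train])
        for i in fold:
            preds[i] = model(rows[i])
    return folds, preds


# Synthetic stand-in for the simulated yields (NOT the study's dataset).
rng = random.Random(42)
rows, targets = [], []
for _ in range(342):
    sowing = rng.choice(["standard", "late"])
    irrigation = rng.choice(["rainfed", "supplemental"])
    y = 3.0                                          # hypothetical baseline
    y += 0.53 if sowing == "standard" else 0.0       # effect from the abstract
    y += 0.17 if irrigation == "supplemental" else 0.0
    y += rng.gauss(0.0, 0.1)                         # hypothetical noise
    rows.append((sowing, irrigation))
    targets.append(y)

folds, preds = cross_validate(rows, targets, k=10)
r = pearson_r(targets, preds)
mae_val = mean_abs_error(targets, preds)
rmse_val = root_mean_sq_error(targets, preds)
print(f"R = {r:.3f}, MAE = {mae_val:.3f} t/ha, RMSE = {rmse_val:.3f} t/ha")
```

Because every instance is predicted by a model that never saw it during fitting, the three scores estimate out-of-sample skill, which is what allows surrogates of different architectures to be compared on equal terms.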

Version published to 10.20944/preprints202603.1628.v1 on Mar 20, 2026

Related articles

Wheat Yield Prediction Based on Random Forest Method

This article has 1 author:
1. Yared Semahegn
This article has no evaluations. Latest version: Mar 25, 2026

Machine Learning and Explainable AI for Agricultural Drought Prediction: A Comparative Analysis of Gradient Boosting Methods Using Multi-Source Earth Observation Data

This article has 4 authors:
1. Mirza Md Tasnim Mukarram
2. Quazi Umme Rukiya
3. Marc Linderman
4. Jun Wang
This article has no evaluations. Latest version: Feb 21, 2026

Explainable Machine Learning for Crop Yield Classification Using Foliar Nutrient Analysis and Management Data in Colombia

This article has 2 authors:
1. Yeison Eduardo Conejo Sandoval
2. Andres Polo
This article has no evaluations. Latest version: Mar 26, 2026
