Spatiotemporal prediction of Soil Organic Carbon Density in Europe (2000–2022) using Earth Observation and Machine Learning

Abstract

The paper describes a comprehensive framework for modeling and mapping soil organic carbon density (SOCD, kg/m³), based on spatiotemporal Random Forest (RF) and Quantile Regression Forests (QRF). A total of 45,616 SOCD observations and various Earth Observation (EO) feature layers were used to produce 30 m SOCD maps for the EU at four-year intervals (2000–2022) and four soil depth intervals (0–20 cm, 20–50 cm, 50–100 cm, and 100–200 cm). Per-pixel 95% probability prediction intervals (PIs) and extrapolation risk probabilities are also provided. Model evaluation indicates good overall accuracy (R² = 0.63 and CCC = 0.76 for hold-out independent tests). Prediction accuracy varies by land cover, depth interval, and year of prediction, and is lowest for shrubland and for the deepest interval (100–200 cm). PI validation confirmed effective uncertainty estimation, though with reduced accuracy for higher SOCD values. Shapley analysis identified soil depth as the most influential feature, followed by vegetation, long-term bioclimate, and topographic features. While pixel-level uncertainty is substantial, spatial aggregation reduces uncertainty by approximately 66%. Detecting SOCD changes remains challenging, but the results offer a baseline for future improvements. The maps, based primarily on topsoil data from cropland, grassland, and woodland, are best suited for applications related to these land covers and depths. We recommend that users interpret the maps in conjunction with local knowledge and consider the accompanying uncertainty and extrapolation risk layers. All data and code are available under an open license at https://doi.org/10.5281/zenodo.13754343 and https://github.com/AI4SoilHealth/SoilHealthDataCube/.
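For readers unfamiliar with the evaluation metrics named in the abstract, the following is a minimal sketch of how R², Lin's concordance correlation coefficient (CCC), and the empirical coverage of a prediction interval can be computed on a hold-out set. It is not taken from the authors' repository; the function names (`r2`, `ccc`, `pi_coverage`) and the toy arrays are illustrative assumptions, and the paper may apply these metrics on a transformed scale or with other conventions.

```python
import numpy as np

def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

def ccc(y_true, y_pred):
    """Lin's concordance correlation coefficient (population moments).

    Unlike Pearson's r, CCC also penalizes bias: it reaches 1 only when
    the predictions lie on the 1:1 line.
    """
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    mean_t, mean_p = y_true.mean(), y_pred.mean()
    cov = np.mean((y_true - mean_t) * (y_pred - mean_p))
    return 2.0 * cov / (y_true.var() + y_pred.var() + (mean_t - mean_p) ** 2)

def pi_coverage(y_true, lower, upper):
    """Fraction of observations falling inside [lower, upper].

    For a well-calibrated 95% prediction interval this should be
    close to 0.95 on independent data.
    """
    y_true = np.asarray(y_true, dtype=float)
    inside = (y_true >= np.asarray(lower)) & (y_true <= np.asarray(upper))
    return float(np.mean(inside))

# Toy hold-out data (hypothetical values, for illustration only).
obs = np.array([1.0, 2.0, 3.0, 4.0])
pred_biased = obs + 1.0  # perfectly correlated, but offset from the 1:1 line

ccc_perfect = ccc(obs, obs)
ccc_biased = ccc(obs, pred_biased)
coverage = pi_coverage(obs, lower=[0, 0, 0, 5], upper=[2, 3, 4, 6])
```

In this toy case the biased predictions are perfectly correlated with the observations yet receive a CCC well below 1, which is why CCC is often reported alongside R² when agreement with the 1:1 line matters.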
