Sentinel-2 Land Cover Classification: State-of-the-Art Methods and the Reality of Operational Deployment
Abstract
This review examines recent advances in Land Use and Land Cover (LULC) classification using Sentinel-2 imagery, with particular focus on the discrepancy between benchmark results and operational performance. While controlled benchmarks such as EuroSAT routinely report accuracies above 98%, real-world systems deployed at regional or global scales achieve only 75-85% accuracy. Through a critical analysis of the recent literature (2020-2024), we identify three main factors behind this gap: (i) methodological issues, most notably the inflation of reported accuracies by spatial autocorrelation between training and test samples; (ii) domain adaptation challenges, where limited geographic and temporal transferability can reduce accuracy by 15-25%; and (iii) training data limitations, where geographic diversity proves more important than absolute sample size. Multi-spectral approaches yield consistent but modest gains of 5-8% over RGB-only inputs, at significantly higher computational cost. Comparisons with operational products such as ESA WorldCover and Google Dynamic World confirm the more modest performance achievable under real-world conditions. The findings emphasize the need for rigorous spatial validation, standardized evaluation protocols, and closer integration between research and operational development. Emerging approaches, including foundation models, active learning, and multi-modal integration, offer promising directions for narrowing the benchmark-to-operations gap.
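To illustrate what spatial validation means in practice, the following is a minimal sketch of a spatially blocked train/test split. It is not the protocol used in any of the reviewed studies. The function name `spatial_block_split`, the 10 km block size, and the synthetic coordinates are all assumptions made for illustration. The idea is to assign whole grid blocks, rather than individual samples, to the test set, so that neighbouring (and therefore correlated) samples cannot end up on both sides of the split.

```python
import math
import random


def spatial_block_split(coords, block_size, test_fraction=0.2, seed=0):
    """Split sample indices into train/test by whole grid blocks.

    coords        : list of (x, y) positions in a projected CRS (e.g. metres)
    block_size    : edge length of a square block, same units as coords
    test_fraction : approximate share of blocks (not samples) held out
    """
    # Grid block containing each sample.
    block_of = [
        (math.floor(x / block_size), math.floor(y / block_size))
        for x, y in coords
    ]
    blocks = sorted(set(block_of))

    # Hold out a random subset of blocks; at least one block is always held out.
    rng = random.Random(seed)
    n_test = max(1, round(test_fraction * len(blocks)))
    test_blocks = set(rng.sample(blocks, n_test))

    train_idx = [i for i, b in enumerate(block_of) if b not in test_blocks]
    test_idx = [i for i, b in enumerate(block_of) if b in test_blocks]
    return train_idx, test_idx, block_of


# Hypothetical example: 1000 samples scattered over a 100 km x 100 km area,
# split with 10 km blocks (the block size is an arbitrary assumption).
_rng = random.Random(42)
coords = [(_rng.uniform(0, 100_000), _rng.uniform(0, 100_000)) for _ in range(1000)]
train_idx, test_idx, block_of = spatial_block_split(coords, block_size=10_000)

# No block contributes samples to both sets.
shared = {block_of[i] for i in train_idx} & {block_of[i] for i in test_idx}
print(len(train_idx), len(test_idx), len(shared))
```

A purely random split would instead draw test samples from the same blocks as the training samples, which is the situation in which spatial autocorrelation can inflate reported accuracy. The block size should ideally exceed the range over which samples are correlated; that range is dataset-specific and is not given here. The per-sample block labels returned above could also be passed as group labels to a grouped cross-validation routine such as scikit-learn's `GroupKFold`.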