Evaluating Deep Learning Change Detection in Aerial Imagery: A New Multi-Metric Benchmarking Framework

Abstract

Change detection (CD) in aerial images is a critical task in applications such as urban expansion monitoring, environmental assessment, and disaster response. However, the existing literature often lacks comprehensive and systematic evaluations of deep learning (DL)-based CD models, leaving gaps in understanding their generalizability, robustness, and performance trade-offs across diverse conditions. This study addresses these gaps by proposing a novel framework for benchmarking and assessing CD models, and applies it in a detailed quantitative evaluation of five state-of-the-art models: CSA-CDGAN, Changeformer, BIT, Tiny, and SNUNet. The framework consists of three distinct evaluation pipelines: (1) cross-testing across diverse benchmark datasets to assess generalization, (2) sensitivity analysis to examine model performance with respect to change size and complexity, and (3) robustness analysis to evaluate resilience against image corruptions and noise. Key results demonstrate the utility of the framework in revealing the strengths and weaknesses of the evaluated models. CSA-CDGAN handles high noise levels best: it achieved the highest precision and F1 score and maintained strong recall across a wide noise spectrum. Changeformer outperforms the other models in moderately noisy conditions (30-31 dB), while Tiny excels at detecting smaller changes under severe noise (29.35-29.5 dB). The framework also highlights the challenges faced by BIT, whose lower precision and recall make it less suited to high-noise environments. This benchmarking framework provides critical insights for selecting suitable CD models based on real-world application needs, considering factors such as noise level, change size, and dataset variability. The results also lay the groundwork for future research, guiding the development of more robust and versatile CD models. The study establishes a new standard for model evaluation, offering a systematic approach to improving the reliability and applicability of CD models in practical scenarios.
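
As a rough illustration of the quantities the abstract mentions, the sketch below computes pixel-wise precision, recall, and F1 for binary change masks, and corrupts an image with Gaussian noise at a measurable level. It is not the authors' implementation. The abstract gives noise levels in dB without naming the measure, so the use of PSNR here is an assumption, as are Gaussian noise as the corruption type, the 8-bit pixel range, and all function names (`change_metrics`, `psnr`, `add_gaussian_noise`). Masks and images are represented as flat lists to keep the example dependency-free.

```python
import math
import random


def change_metrics(pred, truth):
    """Pixel-wise precision, recall, and F1 for binary change masks.

    `pred` and `truth` are flat sequences of 0 (no change) and 1 (change).
    """
    tp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 1)
    fp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 0)
    fn = sum(1 for p, t in zip(pred, truth) if p == 0 and t == 1)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def psnr(clean, noisy, max_value=255.0):
    """Peak signal-to-noise ratio in dB (assumed noise measure, 8-bit range)."""
    mse = sum((c - n) ** 2 for c, n in zip(clean, noisy)) / len(clean)
    if mse == 0:
        return math.inf  # identical images
    return 10 * math.log10(max_value ** 2 / mse)


def add_gaussian_noise(pixels, sigma, seed=0):
    """Add zero-mean Gaussian noise and clip to [0, 255] (hypothetical corruption)."""
    rng = random.Random(seed)
    return [min(255.0, max(0.0, p + rng.gauss(0.0, sigma))) for p in pixels]


if __name__ == "__main__":
    # Toy masks: 2 true positives, 1 false positive, 1 false negative.
    pred = [1, 1, 0, 0, 1, 0]
    truth = [1, 0, 0, 1, 1, 0]
    print(change_metrics(pred, truth))

    # Toy "image": a flat gray patch corrupted at two noise strengths.
    image = [128.0] * 1000
    for sigma in (5.0, 20.0):
        noisy = add_gaussian_noise(image, sigma)
        print(sigma, round(psnr(image, noisy), 2))
```

In a robustness pipeline of the kind described, one would run each model on inputs corrupted at several noise strengths and record `change_metrics` against the ground-truth mask at each measured PSNR. A lower PSNR corresponds to heavier corruption.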
