Geographic Bias Analysis and Cross-Domain Generalization in Deep Learning-Based Building Damage Assessment
Abstract
Automated building damage assessment from satellite imagery has become increasingly critical for rapid disaster response and humanitarian relief operations. However, current state-of-the-art deep learning models generalize poorly when deployed to geographically and environmentally diverse regions. This study investigates the nature and extent of geographic bias in building damage detection systems, revealing that performance degradation stems primarily from geographic and structural characteristics rather than from insufficient training data representation. Through systematic evaluation of top-performing xView2 competition solutions across 17 disaster locations spanning multiple climate zones, we find that even state-of-the-art models struggle to generalize, particularly for the Minor and Major damage classes, and exhibit strong geographic biases toward certain regions. Strikingly, Nepal, despite having the largest training dataset (15,234 images), achieves the worst performance, demonstrating that geographic and structural characteristics outweigh training data quantity in determining generalization behavior. To address these limitations, we explore Fusion Augmentation, a novel methodology that enhances edge detection and structural feature representation by integrating auxiliary information channels with standard RGB imagery. Experimental results show a 7.1% improvement in overall F1 score, with especially large gains for the intermediate Minor and Major damage categories. Domain adaptation experiments on three unseen locations show that combining Fusion Augmentation with supervised fine-tuning yields improvements of 40.8% and 60.0% for the Minor and Major damage classes, respectively, while unsupervised CORAL yields improvements of 24.2% and 39.5% for the same classes, relative to benchmarks.
These findings challenge prevailing assumptions about data-driven generalization in remote sensing AI systems and demonstrate that structural feature enhancement combined with domain adaptation is essential for robust detection across geographically diverse deployment scenarios, providing practical strategies for globally deployable damage assessment systems.
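For readers unfamiliar with CORAL, the unsupervised domain adaptation method named above, the following is a minimal sketch of the standard CORAL loss as commonly formulated in the Deep CORAL literature: the squared Frobenius distance between source and target feature covariances, scaled by 1/(4d²). It is not the authors' implementation. The function name `coral_loss` and the synthetic feature arrays are assumptions made for illustration; in practice the features would come from a network layer evaluated on images from a training region and from an unseen region.

```python
import numpy as np


def coral_loss(source_feats: np.ndarray, target_feats: np.ndarray) -> float:
    """CORAL loss: scaled squared Frobenius distance between feature covariances.

    Both inputs have shape (n_samples, d); the sample counts may differ.
    """
    d = source_feats.shape[1]
    # Unbiased covariance of each domain's features (columns are variables).
    cov_s = np.cov(source_feats, rowvar=False)
    cov_t = np.cov(target_feats, rowvar=False)
    # The 1 / (4 * d^2) scaling follows the usual Deep CORAL formulation.
    return float(np.sum((cov_s - cov_t) ** 2) / (4.0 * d * d))


rng = np.random.default_rng(0)
# Hypothetical stand-ins for features from a training region and an unseen region.
source = rng.normal(size=(200, 8))
target = 2.0 * rng.normal(size=(150, 8)) + 1.0  # different scale and mean

loss_same = coral_loss(source, source)        # identical statistics
loss_shifted = coral_loss(source, target)     # mismatched second-order statistics
loss_mean_only = coral_loss(source, source + 5.0)  # mean shift only
```

Because the loss compares only second-order statistics, a pure mean shift between domains leaves it essentially unchanged, while a change in feature scale increases it. During adaptation this term is typically added to the supervised loss on labeled source data, so that no target labels are required.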