Vision Transformer based Damage Assessment from Post-Disaster Satellite Imagery: An Applied Study on Hurricane Harvey


Abstract

The timely and accurate evaluation of building damage is vital for post-disaster response. This study evaluates the efficacy of a Vision Transformer (ViT-B32) against a baseline Convolutional Neural Network (EfficientNet-B0) for the binary classification of damaged versus undamaged buildings, using 128×128 RGB satellite imagery from Hurricane Harvey. Whereas CNNs encode local features, ViT architectures rely on self-attention mechanisms to capture global spatial relationships, a property that is crucial for spotting intricate damage patterns in visually noisy disaster scenes. Experiments show that the ViT model outperforms the CNN baseline in classification accuracy (97.85% vs 96.90%) and proves more robust to class imbalance, achieving higher F1-score and AUC than the baseline. Furthermore, the study highlights the model's interpretability: we generate attention heatmaps that visualize the specific image regions driving the classification decisions. These visualizations provide actionable insights by precisely localizing structural damage, thereby offering a valuable tool for prioritizing recovery efforts in disaster management workflows.
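The self-attention mechanism that the abstract contrasts with local convolutions can be sketched in a few lines. The snippet below is an illustrative single-head attention over patch embeddings (not the paper's implementation); the patch count follows the abstract's setup, where a 128×128 image split into 32×32 patches (ViT-B32) yields a 4×4 grid of 16 patches, and the projection size `d` and random weights are placeholder assumptions. The attention matrix it returns is the quantity that attention heatmaps visualize.

```python
import numpy as np

def self_attention(x, Wq, Wk, Wv):
    """Single-head self-attention over a sequence of patch embeddings.

    x: (n_patches, d). Every patch attends to every other patch,
    giving the global receptive field the abstract contrasts with
    the local receptive field of a convolution.
    """
    q, k, v = x @ Wq, x @ Wk, x @ Wv
    scores = q @ k.T / np.sqrt(k.shape[-1])       # (n, n) pairwise similarities
    scores -= scores.max(axis=-1, keepdims=True)  # numerical stability
    attn = np.exp(scores)
    attn /= attn.sum(axis=-1, keepdims=True)      # softmax: each row sums to 1
    return attn @ v, attn                         # attn feeds the heatmaps

# 128x128 image, 32x32 patches -> 4x4 grid = 16 patches (ViT-B32 setting).
# d=64 and the random projections are illustrative, not from the paper.
rng = np.random.default_rng(0)
n_patches, d = 16, 64
x = rng.standard_normal((n_patches, d))
Wq, Wk, Wv = (rng.standard_normal((d, d)) * d**-0.5 for _ in range(3))
out, attn = self_attention(x, Wq, Wk, Wv)
```

Reshaping a row of `attn` back to the 4×4 patch grid and upsampling it over the input image is, in essence, how the attention heatmaps localize the regions driving a classification decision.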
