An effective framework for accurate semantic segmentation of high-resolution remote sensing images.

Abstract

Land cover maps produced through semantic segmentation of high-resolution remotely sensed images are a key focus in photogrammetry and remote sensing research. While advances in acquisition technologies have made high-resolution remote sensing (HRRS) images widely available, accurate semantic segmentation remains challenging due to class imbalance, object occlusion, and variations in object size. Deep convolutional neural networks (DCNNs) offer strong feature learning capabilities; however, extracting sufficient features from HRRS images for precise segmentation remains difficult. Effective models must learn robust, multi-contextual features that cover varied object sizes and class disparities while keeping computational cost manageable. Additionally, deeper networks often lose spatial detail during downsampling, leading to coarse segmentation boundaries. To address these issues, we propose a stacked deep residual network (SDRNet) for semantic segmentation of HRRS images. SDRNet employs two stacked encoder-decoder networks to capture long-range semantics while preserving spatial information. Dilated residual blocks (DRBs) are introduced between each encoder and decoder to capture global dependencies, and attention blocks further refine the feature learning process. An intermediate loss applied midway through the network supervises learning in the middle layers. Experiments on the ISPRS Vaihingen and Potsdam datasets show that SDRNet performs competitively, achieving overall accuracies of 90.82% and 90.62%, respectively.
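The architecture outlined in the abstract lends itself to a compact sketch. The following is a minimal, illustrative PyTorch rendering of three of the ideas named above: stacked encoder-decoder stages, a dilated residual block between encoder and decoder, and an auxiliary head for the intermediate loss. All module names, channel widths, and the auxiliary loss weight are assumptions made for illustration, not the authors' implementation, and the attention blocks are omitted for brevity.

```python
# Minimal sketch of the SDRNet ideas, assuming a PyTorch implementation.
# Names (DilatedResidualBlock, SDRNetSketch) and hyperparameters are
# hypothetical; the paper's actual architecture may differ.
import torch
import torch.nn as nn

class DilatedResidualBlock(nn.Module):
    """Residual block with dilated convolutions: enlarges the receptive
    field to capture global dependencies without further downsampling."""
    def __init__(self, channels: int, dilation: int = 2):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=dilation, dilation=dilation),
            nn.BatchNorm2d(channels),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=dilation, dilation=dilation),
            nn.BatchNorm2d(channels),
        )
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        return self.relu(x + self.body(x))  # identity shortcut

def encoder_decoder(channels: int) -> nn.Module:
    """One lightweight encoder-decoder stage: downsample, transform with a
    dilated residual block, then upsample back to the input resolution."""
    return nn.Sequential(
        nn.Conv2d(channels, channels, 3, stride=2, padding=1),  # encoder (downsample)
        DilatedResidualBlock(channels),                          # DRB between encoder and decoder
        nn.Upsample(scale_factor=2, mode="bilinear", align_corners=False),  # decoder
        nn.Conv2d(channels, channels, 3, padding=1),
    )

class SDRNetSketch(nn.Module):
    """Two stacked encoder-decoder stages, with an auxiliary prediction head
    after the first stage so an intermediate loss can supervise the middle
    layers during training."""
    def __init__(self, in_ch: int = 3, channels: int = 64, n_classes: int = 6):
        super().__init__()
        self.stem = nn.Conv2d(in_ch, channels, 3, padding=1)
        self.stage1 = encoder_decoder(channels)
        self.stage2 = encoder_decoder(channels)
        self.aux_head = nn.Conv2d(channels, n_classes, 1)  # intermediate prediction
        self.head = nn.Conv2d(channels, n_classes, 1)      # final prediction

    def forward(self, x):
        f = self.stem(x)
        mid = self.stage1(f)
        out = self.stage2(mid)
        return self.head(out), self.aux_head(mid)

# Training would combine the final and intermediate terms, e.g.
#   loss = ce(final_logits, y) + 0.4 * ce(aux_logits, y)
# where 0.4 is an assumed auxiliary weight, not a value from the paper.
```

Supervising the mid-network prediction in this way gives the first encoder-decoder stage its own gradient signal, which is the usual motivation for intermediate losses in stacked (hourglass-style) architectures.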
