Advancing Urban Roof Segmentation: Transformative Deep Learning Models from CNNs to Transformers for Scalable and Accurate Urban Imaging Solutions A case study in Ben Guerir City, Morocco

Read the full article See related articles

Listed in

This article is not in any list yet, why not save it to one of your lists.
Log in to save this article

Abstract

Urban roof segmentation plays a pivotal role in applications such as urban planning, infrastructure management, and renewable energy deployment. This study explores the evolution of deep learning techniques from traditional Convolutional Neural Networks (CNNs) to cutting-edge Transformer-based models in the context of roof segmentation from satellite imagery. We highlight the limitations of conventional methods when applied to urban environments, including resolution constraints and the complexity of roof structures. To address these challenges, we evaluate two advanced deep learning models: Mask R-CNN and MaskFormer, which have shown significant promise in accurately segmenting roofs, even in dense urban settings with diverse roof geometries. These models, especially the one based on transformers, offer improved segmentation accuracy by capturing both global and local image features, enhancing their performance in tasks where fine detail and contextual awareness are critical. A case study on Ben Guerir City in Morocco, an urban area experiencing rapid development, serves as the foundation for testing these models. Using high-resolution satellite imagery, the segmentation results offer a deeper understanding of the accuracy and effectiveness of these models, particularly in optimizing urban planning and renewable energy assessments. Quantitative metrics such as Intersection over Union (IoU), precision, recall, and F1-score are used to benchmark model performance. Mask R-CNN achieved a mean IoU of 74.6%, precision of 81.3%, recall of 78.9%, and F1-score of 80.1%. MaskFormer outperformed Mask R-CNN, reaching a mean IoU of 79.8%, precision of 85.6%, recall of 82.7%, and F1-score of 84.1%, highlighting the transformative potential of transformer-based architectures for scalable and precise urban imaging. The study also outlines future work in 3D modelling and height estimation, positioning these advancements as critical tools for sustainable urban development.

Article activity feed