Mamba for Remote Sensing: Architectures, Hybrid Paradigms, and Future Directions

Read the full article See related articles

Discuss this preprint

Start a discussion What are Sciety discussions?

Listed in

This article is not in any list yet, why not save it to one of your lists.
Log in to save this article

Abstract

Modern Earth observation combines high spatial resolution, wide swath, and dense temporal sampling, producing image grids and sequences far beyond the regime of standard vision benchmarks. Convolutional networks remain strong baselines but struggle to aggregate kilometre-scale context and long temporal dependencies without heavy tiling and downsampling, while Transformers incur quadratic costs in token count and often rely on aggressive patching or windowing. Recently proposed visual state-space models, typified by Mamba, offer linear-time sequence processing with se-lective recurrence and have therefore attracted rapid interest in remote sensing. This survey analyses how far that promise is realised in practice. We first review the theoretical substrates of state-space models and the role of scanning and serialization when mapping two- and three-dimensional EO data onto one-dimensional sequences. A taxonomy of scan paths and architectural hybrids is then developed, covering cen-tre-focused and geometry-aware trajectories, CNN– and Transformer–Mamba back-bones, and multimodal designs for hyperspectral, multisource fusion, segmentation, detection, restoration, and domain-specific scientific applications. Building on this ev-idence, we delineate the task regimes in which Mamba is empirically warranted—very long sequences, large tiles, or complex degradations—and those in which simpler op-erators or conventional attention remain competitive. Finally, we discuss green com-puting, numerical stability, and reproducibility, and outline directions for phys-ics-informed state-space models and remote-sensing-specific foundation architectures. Overall, the survey argues that Mamba should be used as a targeted, scan-aware com-ponent in EO pipelines rather than a drop-in replacement for existing backbones, and aims to provide concrete design principles for future remote sensing research and op-erational practice.

Article activity feed