MS-GWCT: A Multi-Scale Graph Wavelet Convolutional Transformer for HSI-LiDAR Classification

Abstract

Hyperspectral images (HSIs) and Light Detection and Ranging (LiDAR) data provide complementary spectral and structural information for remote sensing tasks such as land cover classification. To exploit both modalities jointly, we propose a hybrid Multi-Scale Graph Wavelet Convolutional Transformer (MS-GWCT) that combines multi-scale graph wavelet convolutions with Transformer-based attention mechanisms. The model constructs a graph over labeled pixels, using normalized adjacency matrices with optional Gaussian weighting based on spectral and spatial similarities. Graph wavelet convolutions, approximated via Chebyshev polynomials, enable multi-scale feature extraction, while Transformer blocks operating on the graph capture long-range dependencies and cross-modal interactions. Training combines focal loss, label smoothing, a supervised contrastive loss, Mixup data augmentation, and stochastic graph augmentation to improve robustness. Key hyperparameters are tuned automatically using the Schrödinger Optimization Algorithm (SOA). Experiments on the Houston 2013 and Trento datasets show state-of-the-art performance with limited training samples: using only 50 labeled pixels per class, the model reaches overall accuracies of 92.60% and 98.86%, respectively. Ablation studies confirm the contributions of the multi-scale graph convolutions, the attention modules, and the training strategies, while robustness analyses show that the model remains effective under label scarcity and class imbalance.
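
As an illustration of the graph construction and Chebyshev-approximated wavelet filtering described above, the following is a minimal NumPy sketch and not the authors' implementation. It assumes a dense Gaussian-weighted graph, a symmetric normalized adjacency, and a heat kernel exp(-s·λ) as the wavelet; the abstract does not specify the kernel, so that choice is an assumption. The function names `normalized_adjacency`, `heat_wavelet_cheb`, and `multi_scale_features`, and the parameters `sigma`, `scales`, and `order`, are hypothetical.

```python
import numpy as np
from numpy.polynomial import chebyshev as C


def normalized_adjacency(features, sigma=1.0):
    """Gaussian-weighted adjacency with symmetric normalization D^-1/2 W D^-1/2."""
    # Pairwise squared Euclidean distances between node feature vectors.
    d2 = ((features[:, None, :] - features[None, :, :]) ** 2).sum(-1)
    W = np.exp(-d2 / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)  # no self-loops in this sketch
    deg = W.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(np.maximum(deg, 1e-12))
    return d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]


def heat_wavelet_cheb(A_norm, X, scale, order=10):
    """Approximate exp(-scale * L) @ X with a Chebyshev expansion (assumed heat kernel)."""
    # L = I - A_norm has eigenvalues in [0, 2], so L - I = -A_norm lies in [-1, 1].
    L_tilde = -A_norm
    # Chebyshev coefficients of g(x) = exp(-scale * (x + 1)) on [-1, 1].
    coeffs = C.chebinterpolate(lambda x: np.exp(-scale * (x + 1.0)), order)
    # Three-term recurrence: T_k = 2 * L_tilde @ T_{k-1} - T_{k-2}.
    T_prev, T_curr = X, L_tilde @ X
    out = coeffs[0] * T_prev + coeffs[1] * T_curr
    for k in range(2, order + 1):
        T_next = 2.0 * (L_tilde @ T_curr) - T_prev
        out = out + coeffs[k] * T_next
        T_prev, T_curr = T_curr, T_next
    return out


def multi_scale_features(A_norm, X, scales=(0.5, 1.0, 2.0)):
    """Concatenate filtered features from several wavelet scales along the channel axis."""
    return np.concatenate([heat_wavelet_cheb(A_norm, X, s) for s in scales], axis=1)
```

The recurrence needs only matrix products with the normalized adjacency, so no eigendecomposition of the graph Laplacian is required. Small scales keep the filter local, while larger scales diffuse features further across the graph.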
