Unified Cross-Modal Learning for Hydrological Processes Using Multi-Task Transformer Framework
Abstract
Most deep learning studies in hydrology adopt single-task frameworks that address individual variables such as rainfall or streamflow independently, limiting opportunities for shared learning across related environmental processes. This study introduces a unified multi-task, multi-modal deep learning framework that jointly performs 24-hour-horizon streamflow forecasting and rainfall temporal super-resolution. The model employs a shared Transformer encoder with task-specific decoders to integrate temporal and spatial hydrological information within a single architecture. To assess the influence of joint optimization, the same model is also trained separately on each task, enabling direct comparison between single-task and multi-task configurations. Results show that multi-task training maintains or modestly improves predictive accuracy relative to the individually trained counterparts and remains comparable to established baselines. The framework produces stable streamflow forecasts and hydrologically consistent rainfall reconstructions, highlighting the potential of unified, process-aware architectures for representing multiple components of the hydrological cycle within one coherent learning system.
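The shared-encoder, task-specific-decoder layout described in the abstract can be illustrated schematically. The following is a minimal NumPy sketch, not the authors' implementation: the encoder is reduced to a single self-attention block, the decoders to linear readouts, and all dimensions (a 48-step hourly input window, a 24-step forecast horizon, a 4x rainfall super-resolution factor) are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

class SharedEncoder:
    """Stand-in for the shared Transformer encoder: one self-attention block."""
    def __init__(self, d_in, d_model):
        self.W_in = rng.normal(0, 0.1, (d_in, d_model))
        self.W_q = rng.normal(0, 0.1, (d_model, d_model))
        self.W_k = rng.normal(0, 0.1, (d_model, d_model))
        self.W_v = rng.normal(0, 0.1, (d_model, d_model))

    def __call__(self, x):                      # x: (T, d_in)
        h = x @ self.W_in                       # project inputs to model width
        q, k, v = h @ self.W_q, h @ self.W_k, h @ self.W_v
        attn = softmax(q @ k.T / np.sqrt(h.shape[-1]))
        return h + attn @ v                     # residual connection

class LinearHead:
    """Task-specific decoder reduced to a linear readout."""
    def __init__(self, d_model, d_out):
        self.W = rng.normal(0, 0.1, (d_model, d_out))

    def __call__(self, h):
        return h @ self.W

T, d_in, d_model = 48, 4, 16                    # assumed 48-step hourly forcing window
encoder = SharedEncoder(d_in, d_model)          # shared across both tasks
flow_head = LinearHead(d_model, 1)              # streamflow forecast head
rain_head = LinearHead(d_model, 4)              # rainfall head: 4 sub-steps per input step

x = rng.normal(size=(T, d_in))                  # synthetic forcing data
h = encoder(x)                                  # shared representation
q_hat = flow_head(h[-24:])                      # (24, 1): 24-hour streamflow horizon
p_hat = rain_head(h)                            # (48, 4): super-resolved rainfall

print(q_hat.shape, p_hat.shape)                 # (24, 1) (48, 4)
```

In training, both heads would be optimized jointly against a weighted sum of the two task losses (e.g. L = w1*MSE_flow + w2*MSE_rain), which is what allows the shared encoder to learn representations common to both processes.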