The Data Behind AI Coastal Forecasting: Inputs, Sources, and Preprocessing Approaches
Abstract
Coastal zones, shaped by marine and terrestrial processes, are home to over 40% of the global population and contribute significantly to the global economy. However, their attractiveness also makes them vulnerable to extreme coastal water levels (ECWLs), which can lead to catastrophic flooding. ECWLs, driven by sea-level changes, waves, and tidal variations, have become more frequent and severe due to climate change, resulting in significant loss of life and economic damage. Artificial intelligence (AI) has emerged as a powerful tool for forecasting oceanographic processes because it can capture complex, non-linear relationships. However, the performance of AI models depends heavily on the availability, quality, and preparation of oceanographic data, which are often heterogeneous. This study reviews the data types, input features, spatial and temporal resolutions, data coverage, and preprocessing methods used in AI-driven forecasting of ECWL drivers, i.e., waves, tides, and sea level anomaly. The findings highlight the importance of in-situ measurements, remote sensing, numerical simulations, laboratory experiments, and reanalysis data in capturing different aspects of wave dynamics, while emphasising the need for improved data accessibility, integration, and longer datasets. The review also highlights research imbalances, such as limited attention to certain wave dynamics (e.g., wave spectra, wave energy flux), as well as data scarcity in less-resourced regions.