Performance assessment of statistical and deep learning models for predicting chloride trends in island groundwater systems influenced by saltwater intrusion and sea level rise


Abstract

In this study, two statistical time-series models, autoregressive integrated moving average (ARIMA) and generalized autoregressive conditional heteroskedasticity (GARCH), and one deep-learning model, a stacked long short-term memory (LSTM) neural network, were applied to predict groundwater chloride concentrations in Guam, an island in the western Pacific, and their predictive performance was evaluated. The analysis used 42 years (1980–2021) of chloride concentration data from two coastal production wells, one with relatively low variance (F11) and one with relatively high variance (F04), enabling comparison of data characteristics and model performance. Among the models tested, LSTM exhibited strong predictive performance with a configuration of 1,000 epochs, 32 blocks, two hidden layers, and a training ratio of 0.7. Because sea-level rise in Guam began around 1993, three chloride dataset scenarios were applied to the LSTM model to assess data variability before and after its onset, using relative mean squared error (MSE) and the training-to-test relative MSE ratio. The results showed that the high-variance F04 dataset captured clear changes after 1993, indicating the influence of sea-level rise, whereas the low-variance F11 dataset did not. This suggests that high-variance datasets better reflect the effects of sea-level rise, whereas low-variance datasets may be driven by hydrogeological properties, such as lower hydraulic conductivity and porosity complexity, which can retard saltwater–groundwater interaction, or may be influenced by other factors such as rainfall. Therefore, incorporating hydrogeological information is important for reliable interpretation of time-series model results, especially when data variability is insufficient to robustly evaluate predictive performance in long-term, field-based groundwater quality datasets.
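
The two evaluation metrics named in the abstract, relative MSE and the training-to-test relative MSE ratio, can be illustrated with a minimal Python sketch. The abstract does not define relative MSE, so the sketch assumes it is the MSE divided by the variance of the observations; the article itself may use a different normalization. The function names, the synthetic series, its length of 504 points, and the naive persistence forecast used in place of the LSTM are all hypothetical placeholders, not the study's data or method. Only the chronological training ratio of 0.7 comes from the abstract.

```python
import numpy as np


def chronological_split(series, train_ratio=0.7):
    """Split a time series into a leading training part and a trailing test part."""
    series = np.asarray(series, dtype=float)
    cut = int(len(series) * train_ratio)
    return series[:cut], series[cut:]


def relative_mse(observed, predicted):
    """MSE divided by the variance of the observations (assumed definition)."""
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    mse = np.mean((observed - predicted) ** 2)
    return mse / np.var(observed)


def persistence_forecast(segment):
    """Naive one-step forecast: each value is predicted by the previous value.

    Returns (observed, predicted). This stands in for a trained model only
    so that the metrics below have something to score.
    """
    return segment[1:], segment[:-1]


def train_test_relative_mse(series, train_ratio=0.7):
    """Return relative MSE on train, on test, and the train-to-test ratio."""
    train, test = chronological_split(series, train_ratio)
    rel_train = relative_mse(*persistence_forecast(train))
    rel_test = relative_mse(*persistence_forecast(test))
    return rel_train, rel_test, rel_train / rel_test


# Synthetic placeholder series (NOT the Guam chloride data): a slow upward
# trend plus noise, 504 points chosen arbitrarily for illustration.
rng = np.random.default_rng(0)
t = np.arange(504)
synthetic_chloride = 50.0 + 0.05 * t + rng.normal(0.0, 5.0, size=t.size)

rel_train, rel_test, ratio = train_test_relative_mse(synthetic_chloride)
print(f"relative MSE (train): {rel_train:.3f}")
print(f"relative MSE (test):  {rel_test:.3f}")
print(f"train-to-test ratio:  {ratio:.3f}")
```

Under this assumed definition, a ratio near 1 means the forecast errors are similar, relative to data variance, in the training and test periods, while a ratio far from 1 indicates that the two periods behave differently.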