Approximating spatial processes with too many knots degrades the quality of probabilistic predictions
Abstract
1. Spatial and spatiotemporal models are increasingly used in ecology for a range of purposes, such as tracking population change, assessing species distributions, and modelling spatial processes. Many models, including generalized additive models (GAMs) and Gaussian random fields (fit via the Stochastic Partial Differential Equation [SPDE] approach), approximate spatial surfaces using basis functions defined over a set of knots or mesh vertices. A common assumption is that higher resolution approximations with more knots yield better predictions, but this is rarely tested.
2. We developed three case studies and a simulation to understand how mesh resolution affects prediction quality and parameter estimates. Using temperature and groundfish data from U.S. West Coast fish surveys, we built spatial and spatiotemporal models of ocean temperature and fish biomass density. We fit models with varying mesh complexity and assessed prediction quality using log predictive density (log score) with k-fold cross-validation. We also compared SPDE approaches to GAMs.
3. Our temperature models showed that the finest scale meshes decreased log predictive density. As mesh complexity increased, estimated spatial range, spatial variance, and observation error declined. Groundfish models showed similar patterns: more vertices led to smaller estimated spatial range and variance parameters. For most species, mesh resolution had little effect on area-weighted biomass indices, but it did affect the scale of the index for some species. Simulations showed that predictive density declined with increasing mesh complexity, especially under fine-scale spatial variability, high spatial variance, and low observation error. Root mean square error remained stable, indicating that degraded predictive densities stemmed from poorly calibrated uncertainty estimates rather than reduced accuracy of predictions.
4. Our work highlights that practitioners should not assume predictive performance always improves with increased spatial complexity. Selecting appropriate spatial complexity is expected to improve the accuracy of parameter estimates and derived quantities when out-of-sample prediction is a focus of inference.
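The distinction between a stable root mean square error and a declining log predictive density can be illustrated with a small synthetic example. The sketch below is not the authors' analysis. It assumes Gaussian predictive distributions, uses made-up data, and represents miscalibration as an overconfident predictive standard deviation (half the true value). This is one hypothetical form of poorly calibrated uncertainty. The function names are illustrative only.

```python
import math
import random


def gaussian_log_density(y, mu, sigma):
    """Log density of a Normal(mu, sigma) distribution evaluated at y."""
    return -0.5 * math.log(2.0 * math.pi * sigma ** 2) - (y - mu) ** 2 / (2.0 * sigma ** 2)


def log_score(y_obs, mu_pred, sigma_pred):
    """Mean log predictive density over held-out observations (higher is better)."""
    total = sum(gaussian_log_density(y, m, s) for y, m, s in zip(y_obs, mu_pred, sigma_pred))
    return total / len(y_obs)


def rmse(y_obs, mu_pred):
    """Root mean square error of the point predictions (lower is better)."""
    return math.sqrt(sum((y - m) ** 2 for y, m in zip(y_obs, mu_pred)) / len(y_obs))


# Synthetic held-out data (an assumption for illustration, not survey data):
# observations scatter around the predicted means with a true SD of 1.0.
rng = random.Random(1)
n = 2000
true_sd = 1.0
mu_pred = [rng.uniform(0.0, 10.0) for _ in range(n)]
y_obs = [m + rng.gauss(0.0, true_sd) for m in mu_pred]

# Both "models" share the same point predictions, so their RMSE is identical.
rmse_value = rmse(y_obs, mu_pred)

# They differ only in the predictive SD they report.
calibrated = log_score(y_obs, mu_pred, [true_sd] * n)
overconfident = log_score(y_obs, mu_pred, [0.5 * true_sd] * n)

print(f"RMSE (both models):        {rmse_value:.3f}")
print(f"log score (calibrated):    {calibrated:.3f}")
print(f"log score (overconfident): {overconfident:.3f}")
```

Because the point predictions are shared, RMSE cannot distinguish the two sets of predictions, while the log score penalizes the one that understates its uncertainty. In a k-fold cross-validation setting, the same score would be computed on each held-out fold and then combined across folds.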