Offset or not: guidance on accounting for sampling effort in generalized linear models



Abstract

1. Observed data often depend on a measure of sampling effort, as when counts are recorded per unit area. A common tool to account for differences in effort is the ‘offset term’ in a generalized linear model, which assumes a fixed proportional relationship between effort and the response variable. However, there is little detailed guidance on how to apply offsets and transformations, or on when estimating the effect of effort as a covariate might be more appropriate.

2. This article explores the parameterisation and implementation of the offset term, along with additional methods to account for sampling effort in regression models. We evaluate the performance of offsets and covariates across a range of data characteristics through simulation.

3. When the effort–response relationship is uncertain, modelling sampling effort as a log-transformed covariate, ideally as a constrained smoother, is preferable because it covers most scenarios: a proportional relationship, a non-linear (e.g. saturating) relationship, and the flexibility needed in multi-species or hurdle models (e.g. allowing effort to influence detection probability in a binomial model). We show that parameter recovery in effort-as-covariate models is generally robust in simple models, so a log-transformed offset is advantageous only when a proportional relationship is well supported, when model complexity or data availability hinders covariate estimation, or when non-linearity at data limits is uncertain.

4. Although our simulation showed reasonable performance for all sampling effort parameterisations, how to model effort remains a key decision, and one that benefits from careful thought before modelling begins. Choosing the form of the effort–response relationship (i.e. proportional, otherwise linear on the link or original scale, or non-linear), and deciding how multiple effort variables could be included in the same model, should draw on both statistical and practical context and experience.
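
The contrast between an offset and an effort covariate can be sketched as follows. This is an illustration only, assuming a Poisson model with a log link; the symbols (y_i for the count, E_i for sampling effort, x_i for the other covariates, β for their coefficients, and α for the effort coefficient) are notation introduced here, not taken from the article.

```latex
\begin{align}
  y_i &\sim \mathrm{Poisson}(\mu_i) \\
  % Offset: the coefficient on log-effort is fixed at 1
  \text{offset:}\quad
  \log \mu_i &= \log E_i + \mathbf{x}_i^\top \boldsymbol{\beta}
  \;\Longleftrightarrow\;
  \mu_i = E_i \exp\!\left(\mathbf{x}_i^\top \boldsymbol{\beta}\right) \\
  % Covariate: the coefficient on log-effort is estimated
  \text{covariate:}\quad
  \log \mu_i &= \alpha \log E_i + \mathbf{x}_i^\top \boldsymbol{\beta}
  \;\Longleftrightarrow\;
  \mu_i = E_i^{\alpha} \exp\!\left(\mathbf{x}_i^\top \boldsymbol{\beta}\right)
\end{align}
```

Under this notation the offset is the special case α = 1, in which the expected count scales in direct proportion to effort. An estimated α below 1 would mean that the expected count rises less than proportionally with effort.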
