Spatio-temporal machine learning for multi-horizon prediction of bluetongue outbreaks

Abstract

Reliable early warning of infectious disease outbreaks remains a major challenge for surveillance systems, particularly for vector-borne pathogens whose transmission depends on interactions among hosts, vectors, and climate-sensitive environmental conditions. Data-driven forecasting offers a promising approach for predicting outbreak risk using surveillance and environmental data. This study develops a logit-weighted ensemble (LWE), a machine-learning framework that predicts outbreak occurrence 1–6 months ahead at the administrative unit–month scale using routinely available outbreak notifications and gridded climate data. Bluetongue virus (BTV), an arbovirus of ruminants transmitted by Culicoides biting midges, is a well-characterised system in which transmission is strongly shaped by climate, making it a useful test case for applying and evaluating this approach. The framework is evaluated using surveillance data collected between 2005 and 2024 from France, Greece, and Italy, selected for their long-running and high-quality outbreak surveillance records. Across all three countries, the LWE achieved the strongest and most stable predictive performance under a recall-focused evaluation that prioritises correctly identifying outbreak months. It outperformed or matched 14 benchmark models, with differences becoming more pronounced at longer lead times (month +3 onward), when predictions are more uncertain and outbreaks are relatively rare. Predictability varied across countries, with the highest performance in Greece, strong performance in France, and lower, more variable performance in Italy, reflecting differences in how consistently outbreaks occur and spread across regions. Overall, the results demonstrate that horizon-aware, climate-informed forecasting can reliably identify months and locations at elevated risk of outbreak occurrence up to six months in advance, supporting surveillance planning and preparedness across heterogeneous European settings. The ensemble framework provides a robust and portable strategy for outbreak prediction using routinely collected surveillance and environmental data.
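
The abstract names a logit-weighted ensemble and a recall-focused evaluation but does not give their formulas. The following is a minimal sketch of one plausible reading, assuming the ensemble takes a weighted average of base-model outbreak probabilities in logit space and maps the result back to a probability. The function names, the clipping constant `eps`, and the example weights are all hypothetical and are not taken from the study.

```python
import math


def logit(p, eps=1e-6):
    """Log-odds of p, clipped away from 0 and 1 to keep the value finite."""
    p = min(max(p, eps), 1.0 - eps)
    return math.log(p / (1.0 - p))


def sigmoid(z):
    """Inverse of the logit: map log-odds back to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def logit_weighted_ensemble(probs, weights):
    """Combine base-model probabilities by a weighted mean in logit space.

    Assumption: weights are non-negative and not all zero; how the study
    actually chooses its weights is not stated in the abstract.
    """
    if len(probs) != len(weights):
        raise ValueError("probs and weights must have the same length")
    total = sum(weights)
    if total <= 0 or any(w < 0 for w in weights):
        raise ValueError("weights must be non-negative with a positive sum")
    z = sum(w * logit(p) for p, w in zip(probs, weights)) / total
    return sigmoid(z)


def recall(y_true, y_pred):
    """Share of true outbreak unit-months (label 1) that were flagged."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp / (tp + fn) if (tp + fn) else float("nan")


# Hypothetical usage: three base models score one unit-month at one horizon.
risk = logit_weighted_ensemble([0.10, 0.35, 0.60], [1.0, 2.0, 1.0])
flagged = int(risk >= 0.5)
```

Averaging in logit space rather than probability space lets a confident base model pull the combined estimate further than it would under a plain mean. Whether the study uses this property, or fits separate weights per forecast horizon, is not stated in the abstract.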

Author Summary

Predicting infectious disease outbreaks before they occur remains a major challenge, particularly for diseases influenced by environmental conditions. In this study, we focus on bluetongue, a viral disease of livestock transmitted by biting midges, where transmission is strongly affected by climate and seasonal patterns. We develop a method that uses routinely collected outbreak reports and climate data to estimate where and when outbreaks are more likely to occur, up to six months in advance. We apply this approach across three European countries with a history of bluetongue outbreaks. We find that combining climate information with recent outbreak patterns can provide useful early signals of increased risk. Predictions are most accurate at shorter timeframes, but longer-range forecasts can still support planning and preparedness. Because our approach uses widely available data, it could be applied in other regions or to similar environmentally driven diseases. However, it does not include factors such as vaccination, animal movement, or detailed information on vector populations, which may also influence how outbreaks develop.
