Evaluating Traditional, Machine Learning, and Deep Learning Models for Predicting Outpatient Healthcare Utilization in Europe: A Longitudinal Panel Analysis

Abstract

Accurate prediction of outpatient utilization supports responsive healthcare planning. Using longitudinal microdata from the Survey of Health, Ageing and Retirement in Europe (SHARE), covering 10,777 adults observed across Waves 5, 6, 8, and 9 (2013, 2015, 2019–2020, and 2021–2022, respectively), we forecast Wave 9 outpatient visit counts from each respondent's history in Waves 5, 6, and 8. We benchmark generalized estimating equations (GEE) and generalized linear mixed models (GLMM) against neural architectures: sequence-aware LSTM, GRU, CNN, CNN-LSTM, and CNN-GRU models and a feed-forward DNN. All models are evaluated under strict leakage guards (no Wave 9 predictors), with imputation and scaling fitted on the training set only, a fixed 80/20 row-level split, and performance uncertainty estimated from 400 bootstrap replicates. The GEE and GLMM baselines follow standard formulations.
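To make the uncertainty estimate concrete, the following is a minimal sketch of a bootstrap confidence interval for MAE and RMSE on a held-out test set. It assumes a percentile bootstrap over test rows; the abstract does not say which bootstrap variant was used, and the function names (`mae`, `rmse`, `bootstrap_ci`) and the seed are illustrative, not taken from the study.

```python
import math
import random


def mae(y_true, y_pred):
    """Mean absolute error between observed and predicted visit counts."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error between observed and predicted visit counts."""
    return math.sqrt(
        sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    )


def bootstrap_ci(y_true, y_pred, metric, n_boot=400, alpha=0.05, seed=0):
    """Return (point estimate, lower, upper) for `metric`.

    Assumption: a percentile bootstrap that resamples test rows with
    replacement. The model is not refit; only the test rows are resampled.
    """
    rng = random.Random(seed)
    n = len(y_true)
    stats = []
    for _ in range(n_boot):
        # Draw n row indices with replacement and score that resample.
        idx = [rng.randrange(n) for _ in range(n)]
        stats.append(metric([y_true[i] for i in idx], [y_pred[i] for i in idx]))
    stats.sort()
    # Approximate empirical quantiles of the sorted replicate scores.
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return metric(y_true, y_pred), lower, upper
```

Calling `bootstrap_ci(y_test, y_hat, mae)` with the default `n_boot=400` mirrors the replicate count stated above. Resampling only the test rows measures evaluation uncertainty, not variability from retraining the model.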

Across models, adding a second historical wave materially improves accuracy, with smaller gains from a third wave. The best configuration, a CNN-GRU with a shallow causal 1-D convolutional front-end and a two-layer GRU head, achieves MAE = 3.41 (95% CI 3.24–3.59) and RMSE = 4.91 (95% CI 4.60–5.23) on the held-out test set, modestly outperforming a tuned GRU (MAE = 3.46; RMSE = 5.21) and the other deep baselines. Overall, the results indicate that sequence models capture short-horizon temporal structure in SHARE, but that incremental gains hinge more on temporal depth and careful regularization than on architectural complexity alone. Methodologically, the study demonstrates a leak-safe pipeline for panel forecasting in population health that uses modern recurrent and convolutional networks alongside classical longitudinal benchmarks.
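As an illustration of what "causal" means for the convolutional front-end, the sketch below applies a causal 1-D convolution to a single-feature sequence of three historical waves. The kernel size, the kernel weights, and the function name `causal_conv1d` are hypothetical; the study's layer would be a learned, multi-channel one (for example, a Keras `Conv1D` with `padding="causal"`), and its exact configuration is not given in the abstract.

```python
def causal_conv1d(seq, kernel):
    """Causal 1-D convolution over a single-feature sequence.

    The sequence is left-padded with zeros, so the output at step t is a
    weighted sum of seq[t], seq[t-1], ... only. kernel[-1] weights the
    current step and kernel[0] the oldest step in the window.
    """
    k = len(kernel)
    padded = [0.0] * (k - 1) + list(seq)
    return [
        sum(kernel[j] * padded[t + j] for j in range(k))
        for t in range(len(seq))
    ]


# Hypothetical visit counts for Waves 5, 6, and 8, and an averaging kernel.
history = [3.0, 5.0, 4.0]
features = causal_conv1d(history, kernel=[0.5, 0.5])
```

Because of the left padding, changing the Wave 8 value changes only the last output, so earlier positions never see later waves. This is the same property that keeps the front-end consistent with the leakage guards described in the abstract.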
