Evaluating Traditional, Machine Learning, and Deep Learning Models for Predicting Outpatient Healthcare Utilization in Europe: A Longitudinal Panel Analysis
Abstract
Accurate prediction of outpatient utilization supports responsive healthcare planning. Using longitudinal microdata from the Survey of Health, Ageing and Retirement in Europe (SHARE), covering 10,777 adults observed across Waves 5, 6, 8, and 9 (2013, 2015, 2019–2020, and 2021–2022, respectively), we forecast Wave 9 outpatient visit counts from each respondent's history in Waves 5, 6, and 8. We benchmark generalized estimating equations (GEE) and generalized linear mixed models (GLMM) against sequence-aware neural architectures (LSTM, GRU, CNN, CNN-LSTM, and CNN-GRU) and a feed-forward DNN. All models are evaluated under strict leakage guards (no Wave 9 predictors), with imputation and scaling fitted on the training set only and a fixed 80/20 row-level split; performance uncertainty is estimated from 400 bootstrap replicates. The baselines follow standard GEE/GLMM formulations.
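The leakage guards and the bootstrap uncertainty estimate described above can be illustrated with a minimal sketch. It is not the authors' pipeline: median imputation, z-score scaling, the percentile bootstrap, the helper names (`fit_preprocessor`, `apply_preprocessor`, `bootstrap_ci`), and the synthetic data are all assumptions made for illustration. The only details taken from the text are that preprocessing statistics come from the training rows alone, that the split is 80/20, and that 400 replicates are drawn.

```python
import numpy as np

def fit_preprocessor(X_train):
    """Learn imputation and scaling statistics from the training rows only."""
    medians = np.nanmedian(X_train, axis=0)           # assumed: median imputation
    filled = np.where(np.isnan(X_train), medians, X_train)
    mean = filled.mean(axis=0)
    std = filled.std(axis=0)
    std = np.where(std == 0, 1.0, std)                # guard against constant columns
    return medians, mean, std

def apply_preprocessor(X, medians, mean, std):
    """Apply training-set statistics to any split (train or test)."""
    filled = np.where(np.isnan(X), medians, X)
    return (filled - mean) / std

def bootstrap_ci(y_true, y_pred, n_boot=400, alpha=0.05, seed=0):
    """Percentile-bootstrap intervals for MAE and RMSE on a held-out set."""
    rng = np.random.default_rng(seed)
    n = len(y_true)
    maes = np.empty(n_boot)
    rmses = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)              # resample test rows with replacement
        err = y_true[idx] - y_pred[idx]
        maes[b] = np.abs(err).mean()
        rmses[b] = np.sqrt((err ** 2).mean())
    lo, hi = 100 * alpha / 2, 100 * (1 - alpha / 2)
    return {
        "mae": tuple(np.percentile(maes, [lo, hi])),
        "rmse": tuple(np.percentile(rmses, [lo, hi])),
    }

# Synthetic stand-in data (hypothetical; not SHARE).
rng = np.random.default_rng(42)
X = rng.normal(size=(200, 3))
X[rng.random(X.shape) < 0.1] = np.nan                 # inject missing values
y = rng.poisson(4, size=200).astype(float)            # count-like outcome

perm = rng.permutation(len(y))                        # fixed 80/20 row-level split
train_idx, test_idx = perm[:160], perm[160:]

stats = fit_preprocessor(X[train_idx])                # statistics from training rows only
X_train = apply_preprocessor(X[train_idx], *stats)
X_test = apply_preprocessor(X[test_idx], *stats)      # test rows reuse the train statistics

# Placeholder "model": predict the training mean for every test row.
y_pred = np.full(len(test_idx), y[train_idx].mean())
print(bootstrap_ci(y[test_idx], y_pred))
```

The point of the design is that nothing computed from the test rows, whether a median, a mean, or a standard deviation, feeds back into what the model sees during fitting.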
Across models, adding a second historical wave materially improves accuracy, while a third wave yields smaller gains. The best configuration, a CNN-GRU with a shallow causal 1-D convolutional front-end and a two-layer GRU head, achieves MAE = 3.41 (95% CI 3.24–3.59) and RMSE = 4.91 (95% CI 4.60–5.23) on the held-out test set, modestly outperforming a tuned GRU (MAE = 3.46; RMSE = 5.21) and the other deep baselines. Overall, the results indicate that sequence models capture short-horizon temporal structure in SHARE, but that incremental gains hinge more on temporal depth and careful regularization than on architectural complexity alone. Methodologically, the study demonstrates a leakage-safe pipeline for panel forecasting in population health that uses modern recurrent and convolutional networks alongside classical longitudinal benchmarks.
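To make the term "causal 1-D convolution" concrete, the following minimal NumPy sketch shows the defining property: the output at wave t depends only on inputs from wave t and earlier. It is an illustration under assumptions, not the authors' architecture; the function name `causal_conv1d`, the kernel size of 2, and the single-feature toy input are hypothetical, and a real model would use a deep-learning framework with learned kernels feeding a GRU.

```python
import numpy as np

def causal_conv1d(x, kernel):
    """Causal 1-D convolution over a short sequence.

    x:      array of shape (T, C_in), one row per wave
    kernel: array of shape (K, C_in, C_out); kernel[K-1] weights the current step
    Returns an array of shape (T, C_out).

    The input is left-padded with K-1 zero rows, so output[t] is computed
    from x[t-K+1 .. t] only and never sees later waves.
    """
    K, c_in, c_out = kernel.shape
    T = x.shape[0]
    padded = np.concatenate([np.zeros((K - 1, c_in)), x], axis=0)
    out = np.empty((T, c_out))
    for t in range(T):
        window = padded[t:t + K]                      # rows for steps t-K+1 .. t
        out[t] = np.einsum("kc,kco->o", window, kernel)
    return out

# Toy input: three historical waves, one feature (hypothetical values).
x = np.array([[1.0], [2.0], [3.0]])
# Kernel of size 2: weight 1 on the previous wave, weight 2 on the current wave.
kernel = np.array([[[1.0]], [[2.0]]])

features = causal_conv1d(x, kernel)

# Changing the last wave must leave the earlier outputs untouched.
x_changed = x.copy()
x_changed[2, 0] = 100.0
features_changed = causal_conv1d(x_changed, kernel)
print(np.allclose(features[:2], features_changed[:2]))
```

In a CNN-GRU of the kind the abstract describes, per-wave feature vectors like `features` would be passed on to the recurrent layers; the left padding is what keeps the front-end from mixing later waves into earlier time steps.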