Evaluating Traditional, Machine Learning, and Deep Learning Models for Predicting Outpatient Healthcare Utilization in Europe: A Longitudinal Panel Analysis

Vincent Cheng-Sheng Li
Tzu-Pin Lu
Charlotte Wang
Kanya Anindya
John Tayu Lee

Abstract

Accurate prediction of outpatient utilization supports responsive healthcare planning. Using longitudinal microdata from the Survey of Health, Ageing and Retirement in Europe (SHARE), covering 10,777 adults observed across Waves 5, 6, 8, and 9 (2013, 2015, 2019–2020, and 2021–2022, respectively), we forecast Wave 9 outpatient visit counts from the Wave 5, 6, and 8 history. We benchmark generalized estimating equations (GEE) and generalized linear mixed models (GLMM) against sequence-aware neural architectures (LSTM, GRU, CNN, CNN-LSTM, and CNN-GRU) and a feed-forward DNN. All models are trained under strict leakage guards (no Wave 9 predictors), with imputation and scaling fitted on the training set only and a fixed 80/20 row-level split; performance uncertainty is estimated from 400 bootstrap replicates. Baselines follow standard GEE/GLMM formulations.
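The leakage guards and bootstrap uncertainty described above can be illustrated with a minimal NumPy sketch. It is not the authors' pipeline: median imputation, z-score scaling, the percentile bootstrap, the synthetic data, and the function names `fit_preprocessor`, `apply_preprocessor`, and `bootstrap_ci` are all assumptions made for illustration. Only the 80/20 split, the train-only fitting, and the 400 replicates come from the abstract.

```python
import numpy as np

def fit_preprocessor(X_train):
    """Learn imputation medians and scaling statistics from training rows only."""
    medians = np.nanmedian(X_train, axis=0)
    filled = np.where(np.isnan(X_train), medians, X_train)
    mean = filled.mean(axis=0)
    std = filled.std(axis=0)
    std = np.where(std == 0, 1.0, std)  # guard against constant columns
    return medians, mean, std

def apply_preprocessor(X, medians, mean, std):
    """Impute and scale any split using statistics learned on the training split."""
    filled = np.where(np.isnan(X), medians, X)
    return (filled - mean) / std

def bootstrap_ci(y_true, y_pred, n_boot=400, alpha=0.05, seed=0):
    """Percentile bootstrap intervals for MAE and RMSE (assumed CI method)."""
    rng = np.random.default_rng(seed)
    abs_err = np.abs(y_true - y_pred)
    sq_err = (y_true - y_pred) ** 2
    # Each row of idx is one resample of the test rows, drawn with replacement.
    idx = rng.integers(0, len(y_true), size=(n_boot, len(y_true)))
    mae = abs_err[idx].mean(axis=1)
    rmse = np.sqrt(sq_err[idx].mean(axis=1))
    lo, hi = 100 * alpha / 2, 100 * (1 - alpha / 2)
    return {
        "mae": (abs_err.mean(), *np.percentile(mae, [lo, hi])),
        "rmse": (np.sqrt(sq_err.mean()), *np.percentile(rmse, [lo, hi])),
    }

# Synthetic stand-in data (hypothetical; not SHARE).
rng = np.random.default_rng(42)
X = rng.normal(size=(500, 4))
X[rng.random(X.shape) < 0.1] = np.nan           # inject missing values
y = rng.poisson(4.0, size=500).astype(float)    # visit-count-like target

# Fixed 80/20 row-level split.
perm = rng.permutation(500)
train_idx, test_idx = perm[:400], perm[400:]

# Fit preprocessing on the training rows, then apply it to both splits.
stats = fit_preprocessor(X[train_idx])
X_train = apply_preprocessor(X[train_idx], *stats)
X_test = apply_preprocessor(X[test_idx], *stats)

# Placeholder predictor: the training-set mean of the target.
y_pred = np.full(len(test_idx), y[train_idx].mean())
ci = bootstrap_ci(y[test_idx], y_pred)
print(ci)
```

The point of the design is that the test rows never contribute to the imputation medians or scaling statistics, so the reported intervals reflect only resampling of held-out errors.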

Across models, adding a second historical wave materially improves accuracy, with smaller gains from a third wave. The best configuration, a CNN-GRU with a shallow causal 1-D convolutional front-end and a two-layer GRU head, achieves MAE = 3.41 (95% CI 3.24–3.59) and RMSE = 4.91 (95% CI 4.60–5.23) on the held-out test set, modestly outperforming a tuned GRU (MAE = 3.46; RMSE = 5.21) and the other deep baselines. Overall, the results indicate that sequence models capture short-horizon temporal structure in SHARE, but that incremental gains hinge more on temporal depth and careful regularization than on architectural complexity alone. Methodologically, the study demonstrates a leak-safe pipeline for panel forecasting in population health using modern recurrent and convolutional networks alongside classical longitudinal benchmarks.
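To make the term "causal 1-D convolution" concrete, the following NumPy sketch shows the defining property: the output at wave t depends only on inputs from waves up to and including t. The kernel size, channel counts, and the function name `causal_conv1d` are assumptions for illustration; the abstract does not give the authors' layer settings, and the GRU head is not reproduced here.

```python
import numpy as np

def causal_conv1d(x, kernel):
    """Causal 1-D convolution over the time axis.

    x:      array of shape (T, C_in), one row per wave
    kernel: array of shape (K, C_in, C_out)
    Returns an array of shape (T, C_out).
    """
    K = kernel.shape[0]
    # Left-pad with K - 1 zero rows so the window ending at t never looks ahead.
    padded = np.concatenate([np.zeros((K - 1, x.shape[1])), x], axis=0)
    return np.stack([
        np.einsum("kc,kco->o", padded[t:t + K], kernel)
        for t in range(x.shape[0])
    ])

rng = np.random.default_rng(0)
x = rng.normal(size=(3, 5))          # 3 historical waves, 5 features (hypothetical)
kernel = rng.normal(size=(2, 5, 4))  # kernel size 2, 4 output channels (assumed)
out = causal_conv1d(x, kernel)

# Perturb only the last wave: earlier outputs must stay unchanged.
x_changed = x.copy()
x_changed[2] += 1.0
out_changed = causal_conv1d(x_changed, kernel)
print(np.allclose(out[:2], out_changed[:2]))
```

In a CNN-GRU of the kind the abstract describes, the per-wave features produced by such a front-end would then be passed to the recurrent layers in time order.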

Version published to 10.1101/2025.09.09.25335390 on medRxiv
Sep 10, 2025

Related articles

Correcting Algorithmic Bias in Machine Learning Prediction of Healthcare utilization in India

This article has 8 authors:
1. John Tayu Lee
2. Vincent Cheng-Sheng Li
3. Sheng Hui Hsu
4. Tzu-Pin Lu
5. Charlotte Wang
6. Arokiasamy Perianayagam
7. Kanya Anindya
8. Rifat Atun
This article has no evaluations. Latest version: Sep 8, 2025

Predictive Diagnosis of Cardiovascular Disease Using an Optimized RNN-GRU Hybrid Model

This article has 2 authors:
1. Gaurav Kumar
2. Neeraj Varshney
This article has no evaluations. Latest version: Aug 14, 2025

Comparative Evaluation of Machine Learning and Deep Learning Models for Blood Glucose Prediction on the OhioT1DM Dataset

This article has 10 authors:
1. Taofiq Olanrewaju MUSA
2. Arsene ADJEVI
3. Donaldo Omondi JACCOJWANG
4. Nasirudeen ADELEYE
5. Diyaolu Abdulmalik OPEYEMI
6. Süleyman UZUN
7. Mustafa Zahid YILDIZ
8. Ali LAZIM
9. Rhobi Peter
10. Selçuk YAYLACI
This article has no evaluations. Latest version: Aug 21, 2025
