Application of Generalized Linear Mixed Models with Machine Learning Validation for Longitudinal Seizure Count Data: A Clinical Trial in Ethiopia

Abay Kassie Lakew
Tezera Abebe Gashaw
Ebabu Shibabaw Yirdaw

Abstract

Introduction: Epilepsy is a serious neurological condition that has a significant impact on public health, especially in low- and middle-income countries. Longitudinal seizure count data from clinical trials pose significant analytical challenges due to overdispersion, within-subject correlation, and non-normal random-effects distributions. Traditional statistical models often fail to adequately address these complexities.

Methods: This study analyzed longitudinal seizure count data from a randomized controlled clinical trial involving 2403 adult epilepsy patients in North Gondar, Ethiopia, followed for up to 27 weeks. An integrated analytical framework combining Generalized Linear Mixed Models (GLMMs) and machine learning validation techniques was employed. Poisson and negative binomial GLMMs were fitted to account for repeated measurements and individual-level heterogeneity.

Results: The negative binomial GLMM with random intercepts and slopes provided the best fit (AIC = 5419.33, BIC = 5492.94, pseudo-R² = 0.78). Progabide treatment was associated with a 30% decrease in seizure incidence compared with placebo (IRR = 0.70, 95% CI: 0.62–0.80, p < 0.001). Higher baseline seizure frequency, alcohol use, stroke, traumatic brain injury, and brain infection were significantly associated with increased seizure incidence, whereas literacy showed a protective effect. Random-effects analysis revealed substantial between-subject heterogeneity in baseline seizure rates (τ₀² = 0.41; p < 0.001) and in seizure trajectories over time (τ₁² = 0.11; p = 0.008). Machine learning validation demonstrated consistent predictive performance across several validation methods, supporting the robustness of the model.

Conclusions: In the analysis of longitudinal seizure count data, negative binomial GLMMs effectively account for individual heterogeneity, within-subject correlation, and overdispersion. The integration of resampling-based validation techniques with comprehensive model diagnostics enhances both predictive accuracy and interpretability. This analytical approach is well suited for epilepsy research and therapeutic decision-making in resource-limited settings, as it provides robust evidence on treatment effectiveness and identifies key risk factors influencing seizure frequency.
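As a reading aid for the reported treatment effect, the following minimal sketch shows how an incidence rate ratio (IRR) and its confidence interval relate to the underlying log-scale coefficient of a count model. It assumes the reported interval is a Wald interval, symmetric on the log scale, with a critical value of 1.96; the abstract does not state how the interval was computed, and the function name `irr_summary` is hypothetical. Only the IRR and interval quoted in the abstract are used as inputs.

```python
import math

def irr_summary(irr, ci_low, ci_high, z_crit=1.96):
    """Recover log-scale quantities from a reported IRR and its 95% CI.

    Assumes a Wald interval (symmetric on the log scale); this is an
    illustrative assumption, not the authors' stated method.
    """
    beta = math.log(irr)  # log rate ratio, i.e. the model coefficient
    # On the log scale the interval width equals 2 * z_crit * SE
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = beta / se  # Wald z statistic
    # Two-sided p-value from the standard normal distribution
    p = math.erfc(abs(z) / math.sqrt(2))
    pct_change = (irr - 1.0) * 100.0  # percent change in incidence rate
    return {"beta": beta, "se": se, "z": z, "p": p, "pct_change": pct_change}

# Values taken from the abstract: IRR = 0.70, 95% CI 0.62 to 0.80
summary = irr_summary(0.70, 0.62, 0.80)
for name, value in summary.items():
    print(f"{name}: {value:.4g}")
```

Under these assumptions, an IRR of 0.70 corresponds to a negative log-scale coefficient and a 30% lower seizure rate relative to placebo, and the implied p-value falls below 0.001, in line with the reported result.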

Version published to 10.21203/rs.3.rs-8610618/v1 on Research Square
Mar 2, 2026
