Application of Generalized Linear Mixed Models with Machine Learning Validation for Longitudinal Seizure Count Data: A Clinical Trial in Ethiopia


Abstract

Introduction: Epilepsy is a serious neurological condition that has a significant impact on public health, especially in low- and middle-income countries. Longitudinal seizure count data from clinical trials pose significant analytical challenges due to overdispersion, within-subject correlation, and non-normal random-effects distributions. Traditional statistical models often fail to adequately address these complexities.

Methods: This study analyzed longitudinal seizure count data from a randomized controlled clinical trial involving 2403 adult epilepsy patients in North Gondar, Ethiopia, followed for up to 27 weeks. An integrated analytical framework combining Generalized Linear Mixed Models (GLMMs) and machine learning validation techniques was employed. Poisson and negative binomial GLMMs were fitted to account for repeated measurements and individual-level heterogeneity.

Results: The negative binomial GLMM with random intercepts and slopes provided the best fit (AIC = 5419.33, BIC = 5492.94, pseudo-R² = 0.78). Progabide treatment was associated with a 30% decrease in seizure incidence compared with placebo (IRR = 0.70, 95% CI: 0.62–0.80, p < 0.001). Higher baseline seizure frequency, alcohol use, stroke, traumatic brain injury, and brain infection were significantly associated with increased seizure incidence, whereas literacy showed a protective effect. Random-effects analysis revealed substantial between-subject heterogeneity in baseline seizure rates (τ₀² = 0.41; p < 0.001) and in seizure trajectories over time (τ₁² = 0.11; p = 0.008). Machine learning validation demonstrated consistent predictive performance across several validation methods, supporting the robustness of the model.

Conclusions: In the analysis of longitudinal seizure count data, negative binomial GLMMs effectively account for individual heterogeneity, within-subject correlation, and overdispersion. The integration of resampling-based validation techniques with comprehensive model diagnostics enhances both predictive accuracy and interpretability. This analytical approach is well suited for epilepsy research and therapeutic decision-making in resource-limited settings, as it provides robust evidence on treatment effectiveness and identifies key risk factors influencing seizure frequency.
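
As a reading aid for the reported treatment effect, the following minimal Python sketch shows how an incidence rate ratio (IRR) relates to the log-scale coefficient of a count model and to the quoted percentage change. It is not the authors' code. It assumes the 95% interval is a Wald-type interval that is symmetric on the log scale, and the function name `irr_summary` and its parameters are hypothetical, introduced only for illustration.

```python
import math


def irr_summary(irr, ci_low, ci_high, z_crit=1.96):
    """Back out log-scale quantities from a reported IRR and its CI.

    Assumption (not stated in the abstract): the interval is a Wald
    interval, i.e. exp(beta +/- z_crit * se).
    """
    beta = math.log(irr)  # log-scale regression coefficient
    # Width of the interval on the log scale spans 2 * z_crit * se.
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = beta / se  # Wald z statistic
    # Two-sided p-value under a standard normal reference distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    percent_change = (irr - 1.0) * 100.0  # negative means a reduction
    return {
        "beta": beta,
        "se": se,
        "z": z,
        "p_value": p_value,
        "percent_change": percent_change,
    }


# Values as reported in the abstract for progabide vs. placebo.
summary = irr_summary(irr=0.70, ci_low=0.62, ci_high=0.80)
for name, value in summary.items():
    print(f"{name}: {value:.4g}")
```

Under these assumptions, an IRR of 0.70 corresponds to a 30% lower expected seizure rate, and the back-calculated Wald p-value falls below 0.001, in line with the reported result.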
