MEGB: An R package for Mixed-Effect Gradient Boosting for High-dimensional Longitudinal Data



Abstract

High-dimensional longitudinal data present significant analytical challenges because of intricate within-subject correlations and a number of predictors that far exceeds the number of observations. To address these challenges, we introduce Mixed-Effect Gradient Boosting (MEGB), a novel R package that combines gradient boosting with mixed-effects modeling to account simultaneously for population-level fixed effects and subject-specific random variability. MEGB provides a unified framework for analyzing repeated-measures data that accommodates complex covariance structures while using gradient boosting's inherent regularization for robust feature selection and prediction. In comprehensive simulations spanning linear and nonlinear data-generating processes, MEGB achieved 35–76% lower mean squared error (MSE) than state-of-the-art alternatives such as Mixed-Effect Random Forests (MERF) and REEMForest, while maintaining 55–70% true positive rates for variable selection in ultra-high-dimensional regimes (p = 2000). To demonstrate practical utility, we applied MEGB to maternal cell-free plasma RNA data (n = 12 subjects, p = 33,297 transcripts), where it identified 9 key placental transcripts driving fetal RNA dynamics across pregnancy trimesters.
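
For orientation, the combination of a population-level component and subject-specific random effects can be written in the notation commonly used for mixed-effects tree ensembles such as MERF. This is only a sketch under assumed notation: the symbols f, Z_i, b_i, D and σ² are not defined in the abstract, and the exact formulation used by MEGB may differ.

```latex
% Assumed notation (not taken from the abstract):
%   y_i : vector of the n_i repeated measurements for subject i
%   X_i : n_i x p matrix of predictors for subject i
%   f   : unknown population-level (fixed-effect) function
%   Z_i : n_i x q design matrix of the random effects
%   b_i : q-vector of subject-specific random effects
\[
  y_i = f(X_i) + Z_i b_i + \varepsilon_i,
  \qquad
  b_i \sim \mathcal{N}(0, D),
  \qquad
  \varepsilon_i \sim \mathcal{N}\!\left(0, \sigma^2 I_{n_i}\right),
  \qquad
  i = 1, \dots, n .
\]
```

Under this reading, f would be estimated by gradient boosting rather than by a random forest, and estimation in MERF-style methods typically alternates between fitting f to the adjusted response y_i − Z_i b_i and updating the random effects and variance components.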
