BabyPy: a brain-age model for infancy, childhood, and adolescence
Abstract
Intro
Brain-age models quantify biological ageing by predicting a person’s age from neuroimaging data. In early life, brain-age can reflect underlying biological maturity (or immaturity), providing a candidate predictor of typical neurodevelopment versus deviation. Although widely used in adult research, the use of brain-age in early development has been limited due to data availability, heterogeneity and restricted model accessibility. Here, we introduce BabyPy, a shareable brain-age model for individuals aged 0–17 years that achieves accurate predictions despite substantial variability in site, scanner, and preprocessing pipelines.
Methods
We trained BabyPy on 4,021 structural T1-weighted MRI scans from multi-site datasets (ages 0–17 years). An external test set of 1,143 scans (ages 0–16 years) was used for validation. Model inputs were sex and three coarse neuroimaging features: grey matter volume (GMV), white matter volume (WMV), and subcortical grey matter volume (sGMV). An ensemble machine learning approach combined Extra Trees Regression, Support Vector Machine, and Multilayer Perceptron base models. Performance was evaluated via 5-fold cross-validation and external testing.
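As a rough illustration of the kind of ensemble described above, the following sketch stacks the three named base learners with scikit-learn. It is not the BabyPy implementation: the choice of stacking with a ridge meta-learner, all hyperparameters, the helper names `make_synthetic_cohort` and `build_ensemble`, and the synthetic data are assumptions made purely for demonstration.

```python
import numpy as np
from sklearn.ensemble import ExtraTreesRegressor, StackingRegressor
from sklearn.linear_model import RidgeCV
from sklearn.metrics import mean_absolute_error, r2_score
from sklearn.model_selection import cross_val_predict
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR


def make_synthetic_cohort(n=200, seed=0):
    """Generate fake (GMV, WMV, sGMV, sex) features with a made-up age trend.

    The volumes and growth curves are invented, not derived from real scans.
    """
    rng = np.random.default_rng(seed)
    age = rng.uniform(0, 17, n)
    sex = rng.integers(0, 2, n)
    gmv = 400 + 300 * (1 - np.exp(-age / 3)) + 20 * sex + rng.normal(0, 30, n)
    wmv = 150 + 18 * age + 15 * sex + rng.normal(0, 25, n)
    sgmv = 30 + 25 * (1 - np.exp(-age / 2)) + rng.normal(0, 3, n)
    return np.column_stack([gmv, wmv, sgmv, sex]), age


def build_ensemble(seed=0):
    """Stack Extra Trees, SVR and MLP base models under a ridge meta-learner.

    Assumed configuration: the source names the base models but not how
    they are combined or tuned.
    """
    base_models = [
        ("extra_trees", ExtraTreesRegressor(n_estimators=100, random_state=seed)),
        # SVR and MLP are scale-sensitive, so each gets its own scaler.
        ("svr", make_pipeline(StandardScaler(), SVR(C=10.0))),
        ("mlp", make_pipeline(
            StandardScaler(),
            MLPRegressor(hidden_layer_sizes=(32,), max_iter=500, random_state=seed),
        )),
    ]
    return StackingRegressor(estimators=base_models, final_estimator=RidgeCV(), cv=5)


X, age = make_synthetic_cohort()
model = build_ensemble()

# Out-of-fold predictions from 5-fold cross-validation.
predicted_age = cross_val_predict(model, X, age, cv=5)
mae = mean_absolute_error(age, predicted_age)
r2 = r2_score(age, predicted_age)
print(f"5-fold CV on synthetic data: R2 = {r2:.2f}, MAE = {mae:.2f} years")
```

The numbers printed here reflect only the synthetic data and say nothing about BabyPy's real performance.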
Results
The ensemble meta-model explained 80% of the variance in age in the training set (MAE = 1.55 years) and 46% of the variance in the external test set (MAE = 1.72 years).
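For readers less familiar with the two metrics reported above, here is a minimal sketch of how variance explained (R²) and mean absolute error (MAE) are conventionally computed. The function names and the four toy ages are hypothetical and unrelated to the study data.

```python
def mean_absolute_error(true_age, predicted_age):
    """Average absolute difference between chronological and predicted age."""
    errors = [abs(t - p) for t, p in zip(true_age, predicted_age)]
    return sum(errors) / len(errors)


def r_squared(true_age, predicted_age):
    """Proportion of variance in age explained: 1 - SS_res / SS_tot."""
    mean_age = sum(true_age) / len(true_age)
    ss_res = sum((t - p) ** 2 for t, p in zip(true_age, predicted_age))
    ss_tot = sum((t - mean_age) ** 2 for t in true_age)
    return 1 - ss_res / ss_tot


# Toy values in years (invented for illustration).
true_age = [1, 5, 9, 13]
predicted_age = [2, 4, 10, 15]

toy_mae = mean_absolute_error(true_age, predicted_age)  # 1.25
toy_r2 = r_squared(true_age, predicted_age)             # 1 - 7/80 = 0.9125
print(f"MAE = {toy_mae:.2f} years, R2 = {toy_r2:.4f}")
```

Because R² is scaled by the spread of ages in the sample while MAE is expressed in years, the two metrics need not move together across datasets.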
Conclusion
BabyPy is a shareable framework that estimates brain-age across a broad developmental range, removing the need for separate age-specific models. Despite limitations due to data heterogeneity, it demonstrates robust predictive performance and supports cross-study comparisons. Future improvements in data harmonisation will further enhance the utility of generic brain-age models like BabyPy.
Key points
- Broad developmental coverage: BabyPy provides a shareable brain-age model for individuals aged 0–17 years, removing the need for separate age-specific models in the primary developmental period.
- Minimal features, robust performance: Using just three coarse volumetric measures (GMV, WMV, sGMV) plus sex, BabyPy achieves good predictive accuracy (R² = 0.80, MAE = 1.55 years in training; R² = 0.46, MAE = 1.72 years in external validation) across diverse sites, scanners, and preprocessing protocols.
- Open Science contribution: BabyPy is freely available as a Python-based toolbox, facilitating easy adoption, reproducibility, and cross-study comparisons in developmental neuroimaging.