Multimodal Machine Learning Prediction of Adult Wellbeing: Combining longitudinal change variables of childhood problem behaviour with polygenic risk scores
Abstract
Background: Wellbeing is a complex construct showing continuities from childhood to adulthood and is influenced by many different factors, including genetics. Little is known about whether changes in behaviour associated with psychopathology throughout childhood are predictive of adult wellbeing. This study aimed to combine multiple sources of data to improve the machine learning-based prediction of adult wellbeing, using longitudinal data of the Netherlands Twin Register (collected 1991 – 2023; N = 5,087).
Methods: Cross-sectional item scores, longitudinal information summary statistics (e.g., means, standard deviations, autocorrelation) and latent growth modelling on childhood problem behaviour were combined with polygenic scores of phenotypes associated with wellbeing. Different machine learning models (random forest, XGBoost, stacked ensemble linear regression) were trained on different uni- and multimodal variable sets. Variable importance analyses provided insight into which variables from which data modality showed the strongest influence on the predictions.
Results: Models combining all data modalities (cross-sectional and longitudinal information on childhood behaviour and polygenic scores) descriptively performed best (RMSE = 0.97; R² = 0.20). Yet, most of the other models did not perform significantly worse. Variables from all included modalities occurred among the most influential predictors. Longitudinal summary statistics (e.g., means, standard deviations, autocorrelation) were more predictive than variables derived from latent growth models. Interestingly, the predictive longitudinal variables were sometimes derived from items that were not important predictors themselves.
Conclusions: Our results mirror the complexity of wellbeing with many variables having small but significant effects. The potential of combining different data modalities for the prediction of adult wellbeing is highlighted.
Keywords: Wellbeing • Quality of life • childhood psychopathology • longitudinal trajectories • machine learning • latent growth modeling
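To make the longitudinal summary statistics named in the Methods more concrete, the following is a minimal sketch of how a mean, standard deviation and autocorrelation could be derived from one participant's repeated scores on a single item. It is not the authors' pipeline: the function name `longitudinal_summary`, the dictionary keys, the use of lag-1 autocorrelation and the handling of missing waves are all assumptions made for illustration, since the abstract does not specify them.

```python
import numpy as np


def longitudinal_summary(scores):
    """Summarise one participant's repeated scores on one item.

    `scores` is ordered by measurement wave; missing waves are NaN.
    Hypothetical helper for illustration only.
    """
    x = np.asarray(scores, dtype=float)
    valid = x[~np.isnan(x)]

    # Mean and sample standard deviation over the observed waves.
    mean = valid.mean() if valid.size > 0 else np.nan
    sd = valid.std(ddof=1) if valid.size > 1 else np.nan

    # Lag-1 autocorrelation (assumed lag): Pearson correlation between
    # each wave and the next, using only pairs where both are observed.
    prev, nxt = x[:-1], x[1:]
    both = ~np.isnan(prev) & ~np.isnan(nxt)
    prev, nxt = prev[both], nxt[both]
    if prev.size >= 3 and prev.std() > 0 and nxt.std() > 0:
        autocorr = float(np.corrcoef(prev, nxt)[0, 1])
    else:
        # Too few pairs or no variation: the correlation is undefined.
        autocorr = np.nan

    return {"mean": float(mean), "sd": float(sd), "autocorr_lag1": autocorr}


# Example: scores on one item across five waves, one wave missing.
summary = longitudinal_summary([1, 2, np.nan, 2, 3])
```

Each returned value would become one predictor column per item, alongside the cross-sectional item scores and polygenic scores. Returning NaN when the autocorrelation is undefined leaves the choice of imputation or exclusion to the downstream model.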