Predicting Later-Life Self-Rated Health from Childhood Living Conditions Using Machine Learning Approach
Abstract
Background: Previous studies have suggested that childhood living conditions contribute to health inequalities in later life. We employed 8 machine learning (ML) regression models to evaluate the importance of a wide range of childhood living conditions for self-rated health in later life and the heterogeneity of these associations.

Methods: We used data from the China Health and Retirement Longitudinal Study (CHARLS) on 59 childhood living conditions across 8 domains, along with self-rated health assessments from before (2018) and during (2020) the COVID-19 pandemic among older adults across 150 counties (N = 15,461; 53% women).

Results: CatBoost was the best-performing ML model. We mapped out the overall and domain-specific importance of the 59 childhood living conditions and identified the most critical conditions within the 8 domains, such as food deficiency, self-rated health before age 15, exposure to civil war, and being bullied by neighborhood kids. Among women, childhood family financial situation, male guardian upset, and having a group of friends were more influential than among men. Among rural residents, childhood family financial situation, relationship with the male guardian, and male guardian upset played a more important role than among urban residents. During the COVID-19 pandemic, childhood perceived neighborhood safety and neighborhood closeness played a more critical role than before the pandemic.

Conclusion: Our findings suggest the potential for developing a quick screening tool for early intervention and targeted policy. These findings may also have implications for other aging countries in East Asia and for other low- and middle-income contexts.
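The abstract does not state how feature importance was computed, so the following is only an illustrative sketch of one common model-agnostic option, permutation importance: a predictor matters to the extent that shuffling its values on held-out data increases prediction error. Everything here is an assumption made for illustration. The data are synthetic, the three features stand in for hypothetical childhood conditions, and a small k-nearest-neighbors regressor is swapped in for CatBoost so that the example runs with the Python standard library alone.

```python
import random


def knn_predict(train_X, train_y, x, k=5):
    """Predict by averaging the targets of the k nearest training rows."""
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(row, x)), y)
        for row, y in zip(train_X, train_y)
    )
    return sum(y for _, y in dists[:k]) / k


def mse(train_X, train_y, X, y):
    """Mean squared error of the k-NN stand-in model on (X, y)."""
    preds = [knn_predict(train_X, train_y, x) for x in X]
    return sum((p - t) ** 2 for p, t in zip(preds, y)) / len(y)


def permutation_importance(train_X, train_y, X, y, n_repeats=5, seed=0):
    """Average increase in held-out MSE when each feature column is shuffled."""
    rng = random.Random(seed)
    baseline = mse(train_X, train_y, X, y)
    importances = []
    for j in range(len(X[0])):
        increases = []
        for _ in range(n_repeats):
            col = [row[j] for row in X]
            rng.shuffle(col)
            # Rebuild the held-out rows with only column j permuted.
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, col)]
            increases.append(mse(train_X, train_y, X_perm, y) - baseline)
        importances.append(sum(increases) / n_repeats)
    return importances


# Synthetic data (an assumption, not CHARLS): feature 0 has a strong effect,
# feature 1 a weak effect, and feature 2 no effect on the outcome.
data_rng = random.Random(42)


def make_rows(n):
    X = [[data_rng.random() for _ in range(3)] for _ in range(n)]
    y = [3.0 * r[0] + 1.0 * r[1] + data_rng.gauss(0, 0.05) for r in X]
    return X, y


train_X, train_y = make_rows(200)
test_X, test_y = make_rows(100)
importances = permutation_importance(train_X, train_y, test_X, test_y)

for name, score in zip(["feature_0", "feature_1", "feature_2"], importances):
    print(f"{name}: {score:.3f}")
```

Running the same procedure separately on subgroups (for example women and men, or rural and urban residents) and comparing the resulting rankings is one simple way to probe the kind of heterogeneity the abstract reports. With a gradient-boosting model such as CatBoost, the library's own importance measures would normally be used instead of this hand-rolled version.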