Identifying High-Risk Patient Clusters for Falls Using Unsupervised Machine Learning on Linked Primary and Secondary Care Electronic Health Record Data
Abstract
Falls risk is multifactorial, involving a combination of clinical and sociodemographic factors. Although guidelines acknowledge this complexity, most research has focused on individual risk factors, leaving the combined impact of comorbidities relatively understudied. This population-wide study applied unsupervised clustering to electronic health records (EHR) linked across primary and secondary care to identify falls risk profiles in the North West London (NWL) population and to stratify patients by their likelihood of requiring falls-related hospital care. We conducted cluster analysis on patients from NWL General Practice records using coded falls risk factors. Cluster membership was then compared against the risk of falls-related hospital encounters. Among the four identified clusters, two groups of older, multimorbid patients were more than 11 times as likely to have a falls-related hospital encounter (RR 11.45, 95% CI 10.14–12.92 and RR 11.63, 95% CI 10.30–13.13) and had a significantly longer mean length of stay compared with younger, fitter patients. Between the two younger clusters, patients with higher deprivation levels were 29% more likely to have a falls-related hospital encounter (RR 1.29, 95% CI 1.12–1.49). These findings demonstrate that clustering routinely collected EHR data can identify population segments at highest risk of falls-related hospital use, supporting more targeted, multifactorial risk assessment and prevention strategies.
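For readers less familiar with how a relative risk (RR) and its 95% confidence interval are obtained when comparing clusters, the following is a minimal sketch. It assumes the common log-scale Wald interval; the abstract does not state which interval method the authors used. The function name `relative_risk` and all counts below are hypothetical and are not taken from the study.

```python
import math


def relative_risk(events_exposed, n_exposed, events_reference, n_reference, z=1.96):
    """Risk ratio with a Wald confidence interval computed on the log scale.

    Assumption: this is a generic textbook method, not necessarily the one
    used in the study. z=1.96 gives an approximate 95% interval.
    """
    risk_exposed = events_exposed / n_exposed
    risk_reference = events_reference / n_reference
    rr = risk_exposed / risk_reference

    # Standard error of log(RR) for two independent binomial proportions.
    se_log_rr = math.sqrt(
        1 / events_exposed - 1 / n_exposed
        + 1 / events_reference - 1 / n_reference
    )
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, lower, upper


# Hypothetical counts for illustration only (NOT study data):
# 120 of 1000 patients with an encounter in a high-risk cluster,
# 10 of 1000 in the reference cluster.
rr, lower, upper = relative_risk(120, 1000, 10, 1000)
print(f"RR {rr:.2f}, 95% CI {lower:.2f} to {upper:.2f}")
```

An interval that excludes 1, as all three intervals reported in the abstract do, indicates that the difference in risk between the compared clusters is unlikely to be due to chance alone at the 5% level.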