External validation and head-to-head comparison of Eclipse-PRISM and Johns Hopkins ACG risk scores for predicting emergency admissions in an English older population
Abstract
Background
Predictive risk stratification tools are widely used to support proactive care for older adults, yet head-to-head external validation within local English health systems remains limited. Eclipse-PRISM (ePRISM) is implemented in UK primary care settings, while the Johns Hopkins Adjusted Clinical Groups (ACG) system provides risk scores derived from diagnosis groupings and healthcare utilisation data.
Aim
To externally validate and compare ePRISM and Johns Hopkins ACG scores for predicting emergency hospital admission among older adults in an English integrated care system.
Methods
We conducted a retrospective cohort study in the Norfolk and Waveney Integrated Care System. Individuals aged ≥75 years at the index date with valid ePRISM and ACG emergency admission risk scores and linkage to hospital activity data were included. The primary outcome was ≥1 emergency hospital admission within 1 month. Discrimination was assessed using the area under the receiver operating characteristic curve (AUC), with paired AUCs compared using DeLong’s test. Calibration was evaluated using calibration plots and quantified using calibration intercept and slope from logistic recalibration models. Overall accuracy was summarised using the Brier score. Clinical utility was assessed using decision curve analysis (DCA).
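As a rough illustration of the metrics named above, the following minimal sketch computes the AUC, the Brier score, and a calibration intercept and slope on toy data. It is not the study's code: the function names and the ten-patient dataset are hypothetical, the intercept and slope are fitted jointly here (some analyses instead estimate the intercept with the slope fixed at 1), and DeLong's test and confidence intervals are omitted.

```python
import math

def auc(y, p):
    """AUC as the probability that a random case outranks a random non-case (ties count 0.5)."""
    pos = [pi for yi, pi in zip(y, p) if yi == 1]
    neg = [pi for yi, pi in zip(y, p) if yi == 0]
    wins = sum((a > b) + 0.5 * (a == b) for a in pos for b in neg)
    return wins / (len(pos) * len(neg))

def brier(y, p):
    """Mean squared difference between predicted risk and observed outcome."""
    return sum((pi - yi) ** 2 for yi, pi in zip(y, p)) / len(y)

def logit(p):
    return math.log(p / (1 - p))

def recalibrate(y, p, iters=25):
    """Fit logit(P(y=1)) = a + b * logit(p) by Newton-Raphson; returns (a, b).

    Assumption: intercept a and slope b are estimated jointly.
    Perfect calibration corresponds to a = 0 and b = 1.
    """
    x = [logit(pi) for pi in p]
    a, b = 0.0, 1.0
    for _ in range(iters):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for yi, xi in zip(y, x):
            mu = 1 / (1 + math.exp(-(a + b * xi)))
            w = mu * (1 - mu)
            g0 += yi - mu            # score for the intercept
            g1 += (yi - mu) * xi     # score for the slope
            h00 += w
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        a += (h11 * g0 - h01 * g1) / det
        b += (h00 * g1 - h01 * g0) / det
    return a, b

# Hypothetical toy data: observed outcomes and predicted risks for ten patients.
y_toy = [0, 0, 0, 1, 0, 0, 1, 0, 1, 1]
p_toy = [0.05, 0.10, 0.15, 0.20, 0.25, 0.30, 0.40, 0.50, 0.60, 0.80]

auc_toy = auc(y_toy, p_toy)
brier_toy = brier(y_toy, p_toy)
intercept_toy, slope_toy = recalibrate(y_toy, p_toy)
print(auc_toy, brier_toy, intercept_toy, slope_toy)
```

A slope below 1 would suggest predictions that are too extreme, and a negative intercept would suggest overprediction on average. The pairwise AUC above is quadratic in sample size, so a rank-based implementation would be preferable for a cohort of this scale.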
Results
The cohort included 114,407 patients aged ≥75 years; 2,136 (1.87%) had ≥1 emergency admission within 1 month. ePRISM showed higher discrimination than Johns Hopkins ACG (AUC 0.860 [95% CI 0.852 to 0.867] vs 0.739 [95% CI 0.728 to 0.749]; ΔAUC 0.121 [95% CI 0.111 to 0.130]; DeLong p<0.0001), with consistent differences across age and sex subgroups. Calibration differed materially: ePRISM showed closer agreement between predicted and observed risks, whereas Johns Hopkins ACG systematically overpredicted risk across much of the range. Brier scores favoured ePRISM (0.017 [95% CI 0.017 to 0.018] vs 0.051 [95% CI 0.050 to 0.051]). In DCA, ePRISM provided higher net benefit across clinically plausible thresholds, while Johns Hopkins ACG showed lower or negative net benefit across much of the threshold range.
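To make the net benefit results concrete, the sketch below shows the standard decision curve calculation, in which false positives are weighted by the odds of the risk threshold. It is an illustrative sketch only: the function names and toy inputs are hypothetical, and the 5% threshold is an assumed example rather than a threshold reported by the authors. The event rate is taken from the cohort figures above.

```python
def net_benefit(y, p, threshold):
    """Net benefit of intervening on everyone with predicted risk >= threshold."""
    n = len(y)
    tp = sum(1 for yi, pi in zip(y, p) if pi >= threshold and yi == 1)
    fp = sum(1 for yi, pi in zip(y, p) if pi >= threshold and yi == 0)
    # False positives are penalised by the odds of the threshold probability.
    return tp / n - fp / n * threshold / (1 - threshold)

def net_benefit_treat_all(prevalence, threshold):
    """Net benefit of the 'intervene on everyone' reference strategy."""
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)

# Event rate from the reported cohort: 2,136 admissions among 114,407 patients.
prevalence = 2136 / 114407

# Assumed example threshold of 5%; not a value taken from the study.
treat_all_at_5pct = net_benefit_treat_all(prevalence, 0.05)

# Hypothetical four-patient example at a 50% threshold.
toy_nb = net_benefit([1, 0, 0, 1], [0.9, 0.6, 0.1, 0.2], 0.5)
print(prevalence, treat_all_at_5pct, toy_nb)
```

With an event rate below 2%, the intervene-on-everyone strategy has negative net benefit at any threshold above the event rate. A model that overpredicts risk flags many patients who are not admitted, which is one way its net benefit can also fall to or below zero.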
Conclusions
In this English older population, ePRISM demonstrated higher discrimination and more favourable apparent calibration, overall accuracy and decision-analytic performance for predicting 1-month emergency admission than Johns Hopkins ACG. Model selection for short-term risk stratification should therefore consider calibration and clinical utility alongside discrimination, with local validation and recalibration where appropriate before implementation.