Post-ED Trajectory Prediction in Abdominal Pain with a Generative Medical Event Model


Abstract

Importance

Abdominal pain accounts for roughly 10 million US emergency department (ED) visits annually, most of which end in discharge. Post-discharge courses vary, yet existing risk models predict only whether an ED revisit occurs, not what that revisit will entail.

Objective

To evaluate whether Curiosity, a generative medical event foundation model, can predict post-ED-discharge trajectories for adults with abdominal pain, differentiating the timing and severity of expected outcomes.

Design

Retrospective cohort study; encounters January 1–December 31, 2022; 30-day follow-up; analysis conducted in 2026.

Setting

Epic Cosmos research network (multicenter, population-based, de-identified electronic health record).

Participants

Adults (≥18 years) discharged from the ED with abdominal pain, excluding patients in the model's training set. A random sample of 3,000 was drawn from 150,030 eligible patients (65.3% female; median age 47 years [IQR 36–60]).

Exposure

ED discharge after evaluation for abdominal pain.

Main Outcomes and Measures

Primary: area under the receiver operating characteristic curve (AUROC) of Curiosity vs XGBoost models trained separately for each task, for ED revisit ending in admission (admit-revisit), ED revisit ending in discharge (DC-revisit), and any ED revisit, each at 72 hours, 7 days, and 30 days. Secondary: trajectory-level accuracy across 36 trajectory classes and edit distance, each compared with XGBoost; calibration of simulated vs observed conditional path probabilities across 45 transitions.
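As an illustration of the two metric families named above, the following is a minimal sketch, not the study's implementation. It assumes AUROC is computed in its rank-based (Mann-Whitney) form and that edit distance is a unit-cost Levenshtein distance over event sequences; the abstract does not specify either choice, and the function names and event labels are hypothetical.

```python
from typing import Sequence


def auroc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """AUROC as the probability that a random positive outranks a random negative.

    A tie between a positive and a negative counts as half a win
    (the Mann-Whitney U formulation). Labels are 1 (event) or 0 (no event).
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUROC needs at least one positive and one negative")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def edit_distance(a: Sequence[str], b: Sequence[str]) -> int:
    """Unit-cost Levenshtein distance between two event sequences.

    Assumption: insertions, deletions, and substitutions each cost 1.
    """
    previous = list(range(len(b) + 1))
    for i, event_a in enumerate(a, start=1):
        current = [i]
        for j, event_b in enumerate(b, start=1):
            substitution = previous[j - 1] + (event_a != event_b)
            current.append(min(previous[j] + 1, current[j - 1] + 1, substitution))
        previous = current
    return previous[-1]


# Hypothetical example: predicted risk scores and observed admit-revisit labels.
example_auroc = auroc([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0])

# Hypothetical trajectories: the prediction misses one intermediate event.
observed_path = ["ed_discharge", "dc_revisit", "admit_revisit"]
predicted_path = ["ed_discharge", "admit_revisit"]
example_distance = edit_distance(predicted_path, observed_path)
```

The pairwise loop is quadratic in cohort size, which is acceptable for a sample of 3,000 but would be replaced by a sort-based computation for larger cohorts.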

Results

Curiosity identified patients at high risk of a revisit requiring admission more accurately than XGBoost and distinguished them from patients likely to revisit without admission. Among 3,000 patients, Curiosity’s 30-day admit-revisit AUROC was 0.83 (95% CI 0.79–0.87) vs 0.70 (95% CI 0.65–0.75) for XGBoost (DeLong P<.001). The admit-revisit area under the precision-recall curve (AUC-PR) was 0.37 (95% CI 0.29–0.46) against a 4.1% cohort base rate, vs 0.13 (95% CI 0.09–0.19) for XGBoost. Curiosity identified the most likely trajectory out of 36 possibilities for 45.9% of patients (XGBoost 41.0%; McNemar P<.001), with median edit distance 1.28 vs 1.40 (Wilcoxon P<.001). Median absolute calibration error across 45 transitions was 1.30 percentage points (95% CI 0.32–2.49).
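Two of the reported figures can be unpacked with simple arithmetic. The sketch below is illustrative only: it assumes calibration error is the absolute gap between simulated and observed transition probabilities expressed in percentage points, and it reads AUC-PR against the base rate as a ratio, since an uninformative classifier's AUC-PR is approximately the outcome prevalence. The function names and the transition labels are hypothetical; only the AUC-PR values and the 4.1% base rate come from the abstract.

```python
from statistics import median
from typing import Mapping


def median_abs_calibration_error(
    simulated: Mapping[str, float], observed: Mapping[str, float]
) -> float:
    """Median |simulated - observed| over transitions, in percentage points.

    Both mappings hold probabilities in [0, 1], keyed by transition name.
    """
    if simulated.keys() != observed.keys():
        raise ValueError("simulated and observed must cover the same transitions")
    gaps = [abs(simulated[k] - observed[k]) * 100.0 for k in simulated]
    return median(gaps)


def auc_pr_lift(auc_pr: float, base_rate: float) -> float:
    """Ratio of AUC-PR to prevalence (about 1.0 for an uninformative model)."""
    return auc_pr / base_rate


# Hypothetical transition probabilities, for illustration only.
simulated = {"discharge->revisit": 0.10, "revisit->admit": 0.50, "admit->icu": 0.03}
observed = {"discharge->revisit": 0.12, "revisit->admit": 0.47, "admit->icu": 0.03}
example_error = median_abs_calibration_error(simulated, observed)

# Figures reported in the abstract: AUC-PR 0.37 and 0.13 at a 4.1% base rate.
curiosity_lift = auc_pr_lift(0.37, 0.041)
xgboost_lift = auc_pr_lift(0.13, 0.041)
```

Under this reading, the reported AUC-PR values correspond to roughly 9 times the base rate for Curiosity and roughly 3 times for XGBoost.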

Conclusions and Relevance

A generative medical event foundation model produced calibrated trajectory-level predictions and discriminated admit-revisits more effectively than task-specific XGBoost baselines, separating patients who revisited and were admitted from those who revisited and were discharged.
