Leveraging Large Language Models to Develop an Interpretable Prediction Model for Postpartum Hemorrhage Prior to the Onset of Labor

Abstract

Objective

To evaluate whether large language models (LLMs) applied to prenatal clinical notes can predict postpartum hemorrhage (PPH) prior to the onset of labor and to compare model performance across outcome definitions, including a novel intervention-based definition.

Methods

We conducted a retrospective cohort study of 19,992 deliveries within a large regional health network. Two outcome definitions for PPH were used: estimated or quantitative blood loss (EBL/QBL) extracted from clinical notes, and a clinical intervention-based definition (cPPH) incorporating transfusion, uterotonics, Bakri balloon, or hysterectomy. We evaluated three approaches for PPH prediction: (1) supervised machine learning using structured electronic medical record data; (2) direct prediction using a fine-tuned LLM applied to clinical notes; and (3) interpretable models using LLM-extracted features combined with structured data. Model performance was evaluated using area under the receiver operating characteristic curve (AUROC) on a temporally held-out test set.
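The evaluation design described above (a temporally held-out test set scored by AUROC) can be illustrated with a minimal sketch. This is not the authors' code: the record fields (`delivery_date`, `pph`, `risk_score`), the cutoff date, and the toy values are hypothetical, chosen only to show the mechanics. AUROC is computed here from its pairwise definition, the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half.

```python
from datetime import date


def temporal_split(records, cutoff):
    """Split records by delivery date: before the cutoff for training,
    on or after the cutoff for testing (no random shuffling)."""
    train = [r for r in records if r["delivery_date"] < cutoff]
    test = [r for r in records if r["delivery_date"] >= cutoff]
    return train, test


def auroc(labels, scores):
    """Pairwise AUROC: fraction of (positive, negative) pairs in which the
    positive case has the higher score; tied pairs count as 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical toy records; 'risk_score' stands in for any model's output.
records = [
    {"delivery_date": date(2021, 3, 1), "pph": 0, "risk_score": 0.20},
    {"delivery_date": date(2021, 9, 15), "pph": 1, "risk_score": 0.70},
    {"delivery_date": date(2022, 2, 10), "pph": 0, "risk_score": 0.10},
    {"delivery_date": date(2022, 5, 4), "pph": 0, "risk_score": 0.40},
    {"delivery_date": date(2022, 8, 21), "pph": 1, "risk_score": 0.35},
    {"delivery_date": date(2022, 11, 30), "pph": 1, "risk_score": 0.80},
]

train, test = temporal_split(records, cutoff=date(2022, 1, 1))
test_auroc = auroc([r["pph"] for r in test], [r["risk_score"] for r in test])
print(f"train n={len(train)}, test n={len(test)}, test AUROC={test_auroc:.2f}")
```

Splitting by date rather than at random means the test deliveries all occur after the training deliveries, which more closely mimics prospective use of a risk model. The pairwise loop is quadratic in cohort size, so a rank-based implementation would be preferable for a dataset of roughly 20,000 deliveries.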

Results

The LLM-based direct prediction model achieved the highest performance for both PPH definitions (AUROC 0.79–0.80), followed by interpretable models combining LLM-extracted features with structured data (AUROC 0.76–0.78). Models using only structured data performed worse (AUROC 0.65–0.71). The LLM-extracted features approach identified 47 significant predictors, including established risk factors such as multiple gestation and previous cesarean delivery. Demographic differences were observed between PPH definitions: mothers who met only the cPPH definition had lower gestational age and higher rates of cesarean delivery compared to those meeting only the EBL/QBL definition.

Conclusion

These findings highlight the potential of LLM-based approaches for enhancing PPH risk stratification, with the feature extraction method offering a promising balance between predictive performance and clinical utility. Integrating these methods into clinical workflows could improve early detection and guide targeted preventive interventions.
