SBDH-Reader: an LLM-powered method for extracting social and behavioral determinants of health from medical notes


Abstract

Introduction

Social and behavioral determinants of health (SBDH) are increasingly recognized as essential for prognostication and for informing targeted interventions. While medical notes contain rich SBDH details, these details are unstructured, and conventional extraction methods tend to be labor-intensive, inaccurate, and/or unscalable. The emergence of large language models (LLMs) presents an opportunity to develop more effective approaches for extracting SBDH data.

Materials and Methods

We developed the SBDH-Reader, an LLM-powered method that uses prompt engineering to extract structured SBDH data from full-length medical notes. Six SBDH categories were queried: employment, housing, marital relationship, and three types of substance use (alcohol, tobacco, and drug use). The development dataset included 7,225 notes from 6,382 patients in the MIMIC-III database. The method was then independently tested on 971 notes from 437 patients at UT Southwestern Medical Center (UTSW). We evaluated the SBDH-Reader’s performance using precision, recall, F1, and confusion matrices.
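
To make the idea of prompt-based structured extraction concrete, the following is a minimal sketch and not the authors' actual prompt or code. Only the six category names come from the abstract; the prompt wording, the JSON layout, the "not documented" fallback label, and the helper names `build_prompt` and `parse_response` are all assumptions made for illustration. The call to the LLM itself is deliberately left out.

```python
import json

# The six categories are taken from the abstract. Everything else here
# (key spelling, prompt wording, fallback label) is a hypothetical choice.
CATEGORIES = [
    "employment",
    "housing",
    "marital_relationship",
    "alcohol_use",
    "tobacco_use",
    "drug_use",
]
FALLBACK = "not documented"  # assumed label for categories the note does not mention


def build_prompt(note_text: str) -> str:
    """Build an illustrative instruction asking for one label per category."""
    keys = ", ".join(f'"{c}"' for c in CATEGORIES)
    return (
        "Read the medical note below and return a JSON object with exactly "
        f"these keys: {keys}. For each key, give a short attribute label, "
        f'or "{FALLBACK}" if the note is silent on that category.\n\n'
        f"NOTE:\n{note_text}"
    )


def parse_response(raw: str) -> dict:
    """Parse the model's JSON reply into a fixed-shape record.

    Unknown keys are dropped and missing keys get the fallback label,
    so downstream code always sees the same six fields.
    """
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    return {c: str(data.get(c, FALLBACK)) for c in CATEGORIES}
```

With this shape, a reply such as `{"housing": "homeless", "tobacco_use": "current smoker"}` would be normalized to a six-field record in which the four unmentioned categories carry the fallback label. Forcing a fixed schema on the output is one plausible way to obtain the "structured" data the abstract describes.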

Results

When tested on the UTSW validation set, the GPT-4o-based SBDH-Reader achieved a macro-average F1 ranging from 0.85 to 0.98 across the six SBDH categories. For clinically relevant adverse attributes, F1 ranged from 0.94 (employment) to 0.99 (tobacco use). When extracting any adverse attribute across all SBDH categories, the SBDH-Reader achieved an F1 of 0.96, a recall of 0.97, and a precision of 0.96 in this independent validation set.
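
For readers less familiar with these metrics, the sketch below shows how per-label precision, recall, and F1, a macro-average F1, and confusion counts can be computed from paired gold and predicted labels. It assumes the usual definition of macro-averaging (the unweighted mean of per-label F1), which the abstract does not spell out. The labels and the five toy examples are invented for illustration and are not data from the study.

```python
from collections import Counter


def confusion_counts(gold, pred):
    """Count (gold_label, predicted_label) pairs; this is a sparse confusion matrix."""
    return Counter(zip(gold, pred))


def precision_recall_f1(gold, pred, label):
    """Return (precision, recall, F1) for one label, treating it as the positive class."""
    pairs = list(zip(gold, pred))
    tp = sum(1 for g, p in pairs if g == label and p == label)
    fp = sum(1 for g, p in pairs if g != label and p == label)
    fn = sum(1 for g, p in pairs if g == label and p != label)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


def macro_f1(gold, pred):
    """Unweighted mean of per-label F1 over every label seen in gold or pred."""
    labels = sorted(set(gold) | set(pred))
    return sum(precision_recall_f1(gold, pred, lab)[2] for lab in labels) / len(labels)


# Toy labels for a single hypothetical category (not study data).
gold = ["adverse", "adverse", "non-adverse", "non-adverse", "adverse"]
pred = ["adverse", "non-adverse", "non-adverse", "non-adverse", "adverse"]

adverse_p, adverse_r, adverse_f1 = precision_recall_f1(gold, pred, "adverse")
overall_macro_f1 = macro_f1(gold, pred)
matrix = confusion_counts(gold, pred)
```

In the toy example, one adverse note is missed, so the adverse label has perfect precision but a recall of 2/3. Macro-averaging weights every label equally regardless of how often it occurs, which is why it is commonly reported alongside the F1 for the clinically relevant adverse attributes alone.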

Conclusion

A general-purpose LLM can accurately extract structured SBDH data through effective prompt engineering. The SBDH-Reader has the potential to serve as a scalable and effective method for collecting real-time, patient-level SBDH data to support clinical research and care.
