Beyond Metrics to Methods: A Scoping Review of Large Language Models for Detection of Social Drivers of Health in Clinical Notes
Abstract
Objective
This scoping review aimed to map current applications of large language models (LLMs) for extracting social drivers of health (SDoH), benchmark model performance across SDoH domains to define the state of the field, and evaluate methodological approaches to identify research gaps and guide clinical deployment.
Materials and Methods
We searched PubMed, Web of Science, Embase, Scopus, and IEEE Xplore for studies applying LLMs to detect SDoH. We applied a novel methodological framework integrating: (1) a hierarchical classification system for SDoH domains and LLM architectures; (2) a systematic approach to synthesizing performance metrics; and (3) a custom seven-domain instrument for assessing methodological rigor.
Results
Forty-two studies met the inclusion criteria. Behavioral Factors had the highest median F1-score (0.87), while Health Care Access and Quality showed the lowest median and greatest variability (median F1 = 0.59). Research was concentrated in the United States (85.7%), relied largely on private institutional datasets (69%), and often focused on critical care populations (45.2%). Methodological assessment revealed that only 29% of studies provided annotation guidelines, 24% assessed fairness across demographic groups, and 21% validated models externally.
Discussion and Conclusion
Progress in using LLMs for SDoH extraction is limited by variable performance, weak methodological rigor in published studies, and minimal attention to fairness and generalizability. Key gaps include the absence of published annotation guidelines, fairness assessments, and external model validation. LLMs show strong potential for extracting SDoH from clinical text, but moving the field toward clinical deployment demands more standardized, transparent, and robust research.