Exploring the Potential of Large Language Models for Automated Safety Plan Scoring in Outpatient Mental Health Settings

Hayoung K. Donnelly
Gregory K. Brown
Kelly L. Green
Ugurcan Vurgun
Sy Hwang
Emily Schriver
Michael Steinberg
Megan Reilly
Haitisha Mehta
Christa Labouliere
Maria Oquendo
David Mandell
Danielle L. Mowery

Abstract

The Safety Planning Intervention (SPI) produces a plan to help manage patients' suicide risk. High-quality safety plans, that is, those with greater fidelity to the original program model, are more effective in reducing suicide risk. We developed the Safety Planning Intervention Fidelity Rater (SPIFR), an automated tool that assesses safety plan quality using three large language models (LLMs): GPT-4, LLaMA 3, and o3-mini. The LLMs analyzed 266 deidentified safety plans from outpatient mental health settings in New York across four key steps: warning signs, internal coping strategies, making environments safe, and reasons for living. We compared the predictive performance of the three LLMs while optimizing scoring systems, prompts, and parameters. LLaMA 3 and o3-mini outperformed GPT-4, and the recommended scoring system differed by step according to weighted F1-scores. These findings highlight LLMs' potential to provide clinicians with timely and accurate feedback on SPI practices, enhancing this evidence-based suicide prevention strategy.
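The abstract compares models by weighted F1-score. As a minimal sketch of that metric only, the Python below computes a support-weighted mean of per-class F1 scores. The function name `weighted_f1`, the three-level 0/1/2 scoring scale, and the sample ratings are all hypothetical illustrations; they are not the study's scoring system, data, or code.

```python
from collections import Counter


def weighted_f1(y_true, y_pred):
    """Support-weighted mean of per-class F1 scores."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    support = Counter(y_true)  # number of true instances per class
    total = len(y_true)
    score = 0.0
    for label, n in support.items():
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = n - tp
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        # Each class contributes in proportion to its share of true labels.
        score += (n / total) * f1
    return score


# Hypothetical ratings for one safety plan step (0 = low, 1 = medium, 2 = high).
human = [0, 1, 2, 2, 1, 0]  # assumed reference ratings
model = [0, 1, 2, 1, 1, 0]  # assumed LLM ratings
print(round(weighted_f1(human, model), 3))  # prints 0.822
```

Weighting by class support keeps a rare score level from dominating the summary, which matters when fidelity ratings are unevenly distributed across levels.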

Version published to 10.1101/2025.03.26.25324610v1 on medRxiv
Mar 27, 2025

Related articles

Large Language Models as Mental Health Resources: Patterns of Use in the United States

This article has 4 authors:
1. Tony Rousmaniere
2. Xu Li
3. Yimeng Zhang
4. Siddharth Shah
This article has no evaluations. Latest version: Mar 18, 2025

Explainable Suicide Phenotyping from Initial Psychiatric Evaluation Notes Using Reasoning Large Language Models

This article has 8 authors:
1. Zehan Li
2. Wanjing Wang
3. Lokesh Shahani
4. Rodrigo M Vieira
5. Salih Selek
6. Jair Soares
7. Hongfang Liu
8. Ming Huang
This article has no evaluations. Latest version: Mar 28, 2025

Harnessing AI for Patient Engagement in a Study on Large Language Models and Open Notes

This article has 8 authors:
1. Dana Lewis
2. Liz Salmi
3. Jennifer Clarke
4. Zhiyong Dong
5. Rudy Fischmann
6. Emily I. McIntosh
7. Chethan R. Sarabu
8. Catherine M. DesRoches
This article has no evaluations. Latest version: Mar 26, 2025
