Examining the Content and Consistency of Suicidal Thoughts using Large Language Models
Abstract
Suicidal thoughts and behaviors (STBs) are a leading cause of death. Although suicide research and clinical practice often rely on self-report ratings (e.g., “On a scale from 0–10, how strong is your urge to kill yourself right now?”), little is known about how individuals translate internal experiences into numbers. In two longitudinal studies, we used large language models (LLMs) to examine the content and consistency of self-reported suicide urges across numeric severity ratings, between people, and within people over time. Study 1 included adolescents with past-year STBs (N = 158; 1,209 responses) who completed open-ended prompts at baseline and one-month follow-up. Study 2 included young adults with past-year STBs (N = 202; 3,168 responses) who completed parallel prompts at baseline and one-week follow-up. We developed a two-stage topic-modeling pipeline with BERTopic and LLaMA-3.3-70B-Instruct to extract and optimize clinically meaningful themes from participants’ qualitative descriptions of suicidal urges. We identified 13 (Study 1) and 15 (Study 2) topics reflecting the content of suicidal thoughts (e.g., Active Suicidal Ideation, Family/Friends Reactions, Depressed/Exhausted), which varied across severity ratings. Within- and between-person consistency followed a U-shaped pattern, with the highest agreement at the scale endpoints and greater variability and ambiguity in the middle of the scale. Together, our findings reveal remarkable heterogeneity in the interpretation of suicidal urge severity and highlight the potential of LLMs as a scalable method for advancing our understanding of how people interpret and respond to self-report rating scales.