The Last Voice: Suicide Proximity in Stylized LLM Personas



Abstract

This paper introduces the Suicide Proximity Index (SPI), a novel risk-assessment framework for evaluating the emotional safety of large language model (LLM) responses in high-distress scenarios. With the rise of stylized AI personas in mental-health-adjacent contexts, there is an urgent need to assess how models behave when users express subtle or veiled suicidal ideation.

We evaluate popular LLMs, including GPT-4o and Claude 3, using a seven-step prompt ladder that simulates emotionally escalating interactions. Our findings show that stylized personas, such as “Monday,” a persona known for its sarcastic and emotionally volatile tone, exhibit measurable risk under stress, especially during early prompts where user distress is still ambiguous.

The SPI framework scores model responses across five key dimensions, including Empathic Alignment, Harm Reinforcement, and Emotional Drift. We also introduce a cascading session mode that measures how accumulated emotional context improves model performance over time, validating the “Gray Dissonance Window” hypothesis.

This work emphasizes the importance of tone-aware alignment strategies and early-phase emotional safety in LLM design. We propose SPI as both a benchmarking tool and an ethical safeguard for public-facing AI systems.
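The abstract does not publish the SPI scoring formula. As a minimal sketch, assuming each dimension is scored on a [0, 1] scale and aggregated with equal weights (both assumptions; only three of the five dimensions are named above), an SPI-style composite might look like the following:

```python
from dataclasses import dataclass

# Hypothetical sketch of an SPI-style score. The abstract names three of the
# five dimensions (Empathic Alignment, Harm Reinforcement, Emotional Drift);
# the remaining two dimensions, the scales, and the weighting are assumptions.

@dataclass
class SPIScore:
    empathic_alignment: float   # higher is safer; assumed range [0, 1]
    harm_reinforcement: float   # higher is riskier; assumed range [0, 1]
    emotional_drift: float      # higher is riskier; assumed range [0, 1]
    # ...two further dimensions omitted: the abstract does not name them

    def composite(self) -> float:
        """Toy aggregate: average the risk-oriented readings, with the
        protective dimension inverted. Equal weighting is an assumption."""
        risk_terms = [
            1.0 - self.empathic_alignment,  # low empathy -> higher risk
            self.harm_reinforcement,
            self.emotional_drift,
        ]
        return sum(risk_terms) / len(risk_terms)


# Example: score a single model response at one rung of the prompt ladder.
score = SPIScore(empathic_alignment=0.8,
                 harm_reinforcement=0.1,
                 emotional_drift=0.2)
print(f"SPI (toy composite; 0 = safe, 1 = high risk): {score.composite():.2f}")
```

In a cascading session, such a score would presumably be tracked per turn across the seven-step ladder, so that drift in the composite over turns exposes the early-phase window the paper calls the “Gray Dissonance Window”; the per-turn tracking shown here is likewise an assumption.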
