Identifying Suicide-Related Language in Smartphone Keyboard Entries Among High-Risk Adolescents



Abstract

Adolescent suicide rates have risen over the past two decades, underscoring the need for improved risk detection strategies. Although natural language processing (NLP) tools are increasingly used to flag suicide-related content, little is known about how such approaches perform on adolescents’ smartphone communications. Addressing this gap, this study leverages passively collected smartphone data to identify suicide-related language in adolescents’ keyboard usage via NLP. We developed a lexicon of suicide-related adolescent language and validated it with labeled data (N=171,468 text entries; e.g., messages, web searches), demonstrating higher performance in identifying suicide-related text than few-shot prediction with large language models (LLMs) and lexicons not designed for youth. Across two independent cohorts at elevated suicide risk (Ns=208 & 257; >6 million text entries), lifetime suicidal thoughts and behaviors (STB) and current suicidal ideation were associated with increased frequency of smartphone suicide-related language. Human coding indicated varied language, including authentic first-person current suicidal ideation (14.5%) and jokes or hyperbole (20.2%). Compared with the lexicon alone, human coding of suicide-related entries with first-person language showed stronger associations with STB history. However, an LLM showed limited performance in identifying whether suicide-related text indicated authentic, first-person, and current STB (F1=.45). These findings highlight that effective NLP-based tools for suicide prevention will require more nuanced and context-specific approaches to better distinguish suicidal intent.
