INSIGHTFUL: Insight Generation through Clinical Annotation, Analysis, and Modeling of Suicide-Related Factors towards Understanding and Lifesaving

Abstract

Objective

Suicide is a critical medical and public health challenge, particularly among individuals with mental illness treated in safety-net hospitals. To uncover insights about suicidality embedded in unstructured clinical notes, we annotate, analyze, and model a corpus of such notes, with the aim of improving the understanding of suicidality and supporting lifesaving interventions.

Methods

A multidisciplinary panel developed an annotation guideline to capture four key suicide-related factors: Suicidal Ideation (SI), Suicide Attempt (SA), Exposure to Suicide (ES), and Non-Suicidal Self-Injury (NSSI). We created an annotated corpus of 500 notes through a clinically validated annotation process and performed a cohort analysis to characterize the distributions of demographics and suicide-related factors. A large language model was used as a baseline for automatic classification.
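One way to picture the annotation scheme is as a multi-label encoding, in which each note carries zero or more of the four factors. The sketch below is illustrative only: the helper names (`encode`, `combination_counts`) and the toy annotations are assumptions made for this example, not the authors' data format or tooling.

```python
from collections import Counter

# The four suicide-related factors named in the annotation guideline.
LABELS = ("SI", "SA", "ES", "NSSI")


def encode(labels):
    """Turn a set of label names into a 0/1 vector ordered as LABELS."""
    unknown = set(labels) - set(LABELS)
    if unknown:
        raise ValueError(f"unknown labels: {sorted(unknown)}")
    return [1 if name in labels else 0 for name in LABELS]


def combination_counts(notes):
    """Count how often each exact label combination occurs across notes."""
    return Counter(tuple(sorted(n, key=LABELS.index)) for n in notes)


# Toy, made-up annotations (hypothetical, not taken from the corpus).
toy_notes = [{"SI"}, {"SI", "SA"}, set(), {"SA", "SI"}, {"NSSI"}]
vectors = [encode(n) for n in toy_notes]
counts = combination_counts(toy_notes)
```

Counting exact label combinations in this way is one route to co-occurrence figures such as the share of notes that carry both SI and SA.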

Results

The annotated corpus was created with a Cohen’s Kappa of 0.95 and then de-identified for data sharing. Most notes (79.4%) contained one (34.4%) or more (45.0%) suicide-related labels; the co-occurrence of SI and SA was the most frequent combination (35.6%), indicating substantial overlap between the two factors. The cohort had a mean age of 33.4 years, was 51.7% male, and was 75.8% single. Prevalent stressors included unemployment (24.2%), homelessness (12.0%), limited healthcare access (5.4%), and legal challenges (5.0%). We identified four issues whose resolution could improve the documentation of suicidality: implicitness, conflicting statements, ambiguity, and incomplete definition coverage. The baseline model achieved a micro-averaged F1 score of 0.70, indicating satisfactory performance in multi-label classification.
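For readers unfamiliar with the two metrics reported above, the following is a minimal sketch of their standard definitions. It assumes two annotators assigning one categorical label per item and 0/1 label vectors per note; the abstract does not say how agreement was aggregated across the four factors, so this is not a reconstruction of the authors' evaluation. The function names and toy inputs are hypothetical.

```python
def cohen_kappa(a, b):
    """Cohen's kappa for two equal-length sequences of categorical labels."""
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(a)
    # Observed agreement: fraction of items on which both annotators agree.
    observed = sum(1 for x, y in zip(a, b) if x == y) / n
    # Chance agreement: product of each annotator's marginal rate, per category.
    categories = set(a) | set(b)
    expected = sum((a.count(c) / n) * (b.count(c) / n) for c in categories)
    if expected == 1.0:
        return 1.0  # both annotators used the same single category throughout
    return (observed - expected) / (1 - expected)


def micro_f1(y_true, y_pred):
    """Micro-averaged F1: pool TP/FP/FN over every note and every label."""
    tp = fp = fn = 0
    for true_row, pred_row in zip(y_true, y_pred):
        for t, p in zip(true_row, pred_row):
            if t == 1 and p == 1:
                tp += 1
            elif t == 0 and p == 1:
                fp += 1
            elif t == 1 and p == 0:
                fn += 1
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Toy, made-up inputs (hypothetical, not taken from the corpus).
kappa = cohen_kappa([1, 1, 0, 0], [1, 1, 0, 1])
f1 = micro_f1([[1, 1, 0, 0], [0, 0, 0, 1]], [[1, 0, 0, 0], [0, 0, 1, 1]])
```

Micro-averaging pools counts across all labels before computing F1, so frequent labels such as SI weigh more heavily than rare ones; macro-averaging would instead weight each label equally.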

Conclusion

The near-perfect inter-annotator agreement supports the reliability of the proposed annotation process and the quality of the resulting data. The cohort analysis characterizes how suicide-related factors are distributed and highlights how they are documented. The data modeling demonstrates the potential of AI-powered methods to generate insights by mining large-scale clinical notes.
