Multimodal Speech and Text Models to Detect Suicidal Risks in Adolescents
Abstract
Background
Early detection of suicide risk in adolescents is crucial but faces challenges including stigma, reluctance to disclose suicidal thoughts, and limited accessibility of mental health resources. Traditional assessment methods may miss at-risk populations, particularly in community settings. This study aimed to explore whether multimodal analysis combining acoustic and linguistic features can improve prediction of suicide risk in adolescents.
Methods
Voice recordings and transcribed text from 600 Chinese adolescents (aged 10-18 years) were collected from 47 schools in Guangdong, China. Suicide risk labels were derived from the Mini International Neuropsychiatric Interview for Children and Adolescents (MINI-KID). The dataset included three voice tasks: answering an open-ended question about emotional regulation, reading a standard passage, and describing a face showing negative emotions. Features were extracted using pre-trained models (emotion2vec for acoustic features, Paraformer for speech-to-text conversion, and Tongyi Qianwen’s text-embedding-v3 for text features). We applied several machine learning classifiers, including Support Vector Machine, Multi-layer Perceptron, Random Forest, and XGBoost, to develop both single-modal and multimodal prediction models. Front-end fusion (FF) and back-end fusion (BF) techniques were employed to combine the acoustic and linguistic features.
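To make the two fusion strategies concrete, the following is a minimal Python sketch assuming the acoustic and text features have already been extracted into fixed-length vectors. The variable names, feature dimensions, and classifier choices here are illustrative assumptions, not the study's exact configuration.

```python
# Minimal sketch of front-end vs. back-end fusion (hypothetical setup; the
# feature dimensions and classifiers are assumptions, not the study's config).
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.ensemble import RandomForestClassifier
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(0)
n = 600
X_acoustic = rng.normal(size=(n, 768))   # stand-in for emotion2vec embeddings
X_text = rng.normal(size=(n, 1024))      # stand-in for text-embedding-v3 vectors
y = rng.integers(0, 2, size=n)           # stand-in for MINI-KID-derived labels

idx_train, idx_test = train_test_split(
    np.arange(n), test_size=0.2, stratify=y, random_state=0
)

# Front-end fusion (FF): concatenate modalities into one joint feature vector,
# then train a single classifier on the combined representation.
X_ff = np.hstack([X_acoustic, X_text])
ff_model = make_pipeline(StandardScaler(), SVC(probability=True))
ff_model.fit(X_ff[idx_train], y[idx_train])
p_ff = ff_model.predict_proba(X_ff[idx_test])[:, 1]

# Back-end fusion (BF): train one classifier per modality and combine their
# predicted probabilities (here, a simple unweighted average).
acoustic_model = make_pipeline(StandardScaler(), SVC(probability=True))
text_model = RandomForestClassifier(n_estimators=300, random_state=0)
acoustic_model.fit(X_acoustic[idx_train], y[idx_train])
text_model.fit(X_text[idx_train], y[idx_train])
p_bf = (acoustic_model.predict_proba(X_acoustic[idx_test])[:, 1]
        + text_model.predict_proba(X_text[idx_test])[:, 1]) / 2
```

In this framing, FF lets a single classifier learn cross-modal interactions from the joint feature space, whereas BF keeps each modality's model independent and combines only their output probabilities; the study also evaluated models using both strategies together.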
Results
Fusion models combining acoustic and linguistic features consistently outperformed single-modal models. The model using both front-end and back-end fusion achieved the highest performance, with an accuracy of 0.73, precision of 0.70, recall of 0.80, and F1 score of 0.74. Front-end fusion alone achieved the highest Area Under the Receiver Operating Characteristic Curve (AUROC) of 0.767. Model performance was comparable across age groups but significantly better in females (AUROC = 0.72) than in males (AUROC = 0.46).
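For reference, the reported metrics, including the subgroup AUROC comparison by sex, can be computed with scikit-learn as in the brief sketch below; the arrays are placeholder values, not study data.

```python
# Computing the reported evaluation metrics with scikit-learn
# (placeholder arrays; not the study's data).
import numpy as np
from sklearn.metrics import (accuracy_score, precision_score, recall_score,
                             f1_score, roc_auc_score)

y_true = np.array([0, 1, 1, 0, 1, 0, 1, 0])            # ground-truth risk labels
p_pred = np.array([0.2, 0.8, 0.6, 0.4, 0.7, 0.3, 0.4, 0.1])  # predicted probabilities
sex = np.array(["F", "F", "M", "M", "F", "F", "M", "M"])

y_hat = (p_pred >= 0.5).astype(int)  # threshold probabilities at 0.5
print("accuracy :", accuracy_score(y_true, y_hat))
print("precision:", precision_score(y_true, y_hat))
print("recall   :", recall_score(y_true, y_hat))
print("F1       :", f1_score(y_true, y_hat))
print("AUROC    :", roc_auc_score(y_true, p_pred))

# Subgroup AUROC, e.g., to compare performance in females vs. males.
for g in ("F", "M"):
    mask = sex == g
    print(f"AUROC ({g}):", roc_auc_score(y_true[mask], p_pred[mask]))
```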
Conclusions
Multimodal analysis combining acoustic and linguistic features significantly improves predictive accuracy for adolescent suicide risk detection compared to single-modal approaches. This approach offers a promising method for early identification of at-risk adolescents in community settings, potentially enabling timely intervention. Further external validation with larger samples is needed to optimize these models for clinical application.