Multi-Level Depression Severity Detection with Deep Transformers and Enhanced Machine Learning Techniques

Abstract

Depression is among the most common mental health concerns of the digital era, calling for robust computational tools that can both detect it and estimate its severity. This study proposes a multi-level depression severity detection framework for the Reddit social media platform, classifying posts into four levels: minimum, mild, moderate, and severe. We take a dual approach that combines classical Machine Learning (ML) algorithms with recent Transformer-based architectures. For the ML track, we build ten classifiers: Logistic Regression, SVM, Naive Bayes, Random Forest, XGBoost, Gradient Boosting, K-NN, Decision Tree, AdaBoost, and Extra Trees, each paired with two widely used embedding methods, Word2Vec and GloVe, and tuned for mental health text classification. Of these, XGBoost yields the highest F1-score of 94.01 using GloVe embeddings. For the deep learning track, we fine-tune ten Transformer models: BERT, RoBERTa, XLM-RoBERTa, MentalBERT, BioBERT, RoBERTa-large, DistilBERT, DeBERTa, Longformer, and ALBERT. MentalBERT achieves the highest performance with an F1-score of 97.31, followed by RoBERTa (96.27) and RoBERTa-large (96.14). Our results demonstrate that domain-adapted Transformers outperform non-Transformer-based ML methods in capturing the subtle linguistic cues indicative of different levels of depression, highlighting their potential for fine-grained mental health monitoring in online settings.
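
As a concrete illustration of the classical ML track, the sketch below pairs mean-pooled GloVe word vectors with an XGBoost classifier over the four severity labels. It is a minimal sketch rather than the authors' implementation: the dataset file, column names, and the specific GloVe checkpoint are hypothetical placeholders.

    # Minimal sketch: mean-pooled GloVe embeddings + XGBoost for 4-class severity.
    # File name, column names, and the GloVe checkpoint are assumed placeholders.
    import numpy as np
    import pandas as pd
    import gensim.downloader as api
    from sklearn.metrics import f1_score
    from sklearn.model_selection import train_test_split
    from xgboost import XGBClassifier

    glove = api.load("glove-wiki-gigaword-300")      # pre-trained GloVe vectors

    def embed(text):
        """Average the GloVe vectors of in-vocabulary tokens."""
        vecs = [glove[tok] for tok in text.lower().split() if tok in glove]
        return np.mean(vecs, axis=0) if vecs else np.zeros(glove.vector_size)

    df = pd.read_csv("reddit_depression_posts.csv")  # hypothetical file
    labels = {"minimum": 0, "mild": 1, "moderate": 2, "severe": 3}
    X = np.vstack([embed(post) for post in df["post"]])
    y = df["severity"].map(labels).to_numpy()

    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2,
                                              stratify=y, random_state=42)
    clf = XGBClassifier(objective="multi:softprob", eval_metric="mlogloss",
                        random_state=42)
    clf.fit(X_tr, y_tr)
    print("weighted F1:", f1_score(y_te, clf.predict(X_te), average="weighted"))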
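
The Transformer track can be sketched analogously with the Hugging Face Trainer API. This is likewise illustrative rather than the authors' code: the MentalBERT checkpoint ID, file path, column names, and hyperparameters are assumptions, and any of the ten listed Transformer models could be substituted.

    # Minimal sketch: fine-tuning a BERT-family checkpoint for 4-class severity
    # classification. Checkpoint ID, file name, and hyperparameters are assumed.
    import numpy as np
    import pandas as pd
    from datasets import Dataset
    from sklearn.metrics import f1_score
    from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                              Trainer, TrainingArguments)

    MODEL_ID = "mental/mental-bert-base-uncased"     # assumed MentalBERT checkpoint
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForSequenceClassification.from_pretrained(MODEL_ID, num_labels=4)

    df = pd.read_csv("reddit_depression_posts.csv")  # hypothetical file
    labels = {"minimum": 0, "mild": 1, "moderate": 2, "severe": 3}
    ds = Dataset.from_dict({"text": df["post"].tolist(),
                            "label": df["severity"].map(labels).tolist()})
    ds = ds.map(lambda b: tokenizer(b["text"], truncation=True, max_length=256,
                                    padding="max_length"), batched=True)
    split = ds.train_test_split(test_size=0.2, seed=42)

    def compute_metrics(eval_pred):
        logits, y_true = eval_pred
        return {"weighted_f1": f1_score(y_true, np.argmax(logits, axis=-1),
                                        average="weighted")}

    trainer = Trainer(
        model=model,
        args=TrainingArguments(output_dir="mentalbert-severity",
                               num_train_epochs=3,
                               per_device_train_batch_size=16,
                               learning_rate=2e-5),
        train_dataset=split["train"],
        eval_dataset=split["test"],
        compute_metrics=compute_metrics,
    )
    trainer.train()
    print(trainer.evaluate())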
