A Time-Aware Multilingual Multimodal Framework for Depression Detection on Social Media
Abstract
Around 280 million people worldwide live with depression, making it one of the most common mental health conditions [42]. Early detection is one of the most effective ways to support those in need and prevent their condition from worsening. Social media offers a window into people's daily activities and feelings, and people are often more willing to share their genuine opinions online than in clinical settings, which makes these platforms a useful source for studying mental health trends. However, most previous studies have examined only English text and have ignored the variety of languages and media that people use on social platforms. This gap is especially evident in India, where users often write in Hinglish (a natural mix of Hindi and English), which poses additional linguistic challenges. To bridge this gap, we introduce a time-aware, multilingual, multimodal framework for detecting signs of depression in social media posts collected from X (formerly Twitter). By learning jointly from text, images, and posting times, the model gains deeper insight into how users behave and express themselves. In our experiments, the model performs consistently across runs, achieving an F1-score of 0.79 and an AUC of 0.74 and outperforming all text-only and single-language baselines. These results suggest that combining behavioural, visual, and textual cues improves both the accuracy and the cross-linguistic flexibility of depression detection. To our knowledge, this is the first study to examine multilingual and multimodal depression detection using real Indian social media data. It shows the value of multicultural research and offers a practical framework for developing tools that support online mental health monitoring.