Do AI Models Understand Mental Health Conversations? A Study of Subreddit Classification and Explainability
Abstract
Social media is a powerful tool for discussing mental health, and the conversations that take place in these spaces offer a unique insight into how users talk about the issue. This study uses state-of-the-art transformer models, BERT and MentalBERT, to classify Reddit posts from specialised subreddits on anxiety, depression, bipolar disorder and borderline personality disorder (BPD). By assessing how well subreddit conversations align with their intended mental health focus, the analysis indicates whether these communities are serving their purpose as support spaces. Our classification models achieve an average accuracy of 82%, with MentalBERT slightly outperforming BERT. To make the predictions transparent, we use Local Interpretable Model-agnostic Explanations (LIME) to identify the key linguistic patterns that influence them. The results reveal distinct language use across conditions: for example, discussions in bipolar disorder subreddits often refer to mood instability, while BPD communities emphasise challenges in emotional regulation. By integrating classification with explainability, this study provides insights that can help mental health professionals understand trends in online discourse and support platform moderation efforts to promote more effective and supportive digital environments. This, in turn, can help to raise awareness and reduce stigma around mental health issues.
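To illustrate the idea behind the LIME step, the following is a minimal sketch of a LIME-style explanation for a text classifier. It is not the study's pipeline: `toy_predict_proba` and its `CUES` table are hypothetical stand-ins for a fine-tuned BERT or MentalBERT model, and `explain` and `solve` are illustrative helpers written for this example rather than calls from the LIME library. The exponential proximity kernel and the small ridge penalty are likewise assumptions. The sketch removes random words from a post, queries the classifier on each perturbed version, and fits a proximity-weighted linear surrogate whose coefficients indicate how strongly each word pushes the prediction.

```python
import math
import random

# Hypothetical keyword weights standing in for a fine-tuned classifier.
CUES = {"mood": 1.5, "manic": 2.0, "swings": 1.0}

def toy_predict_proba(tokens):
    """Toy black box: probability that a post belongs to one class."""
    score = sum(CUES.get(t, 0.0) for t in tokens) - 1.0
    return 1.0 / (1.0 + math.exp(-score))

def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                for c in range(col, n + 1):
                    m[r][c] -= f * m[col][c]
    return [m[i][n] / m[i][i] for i in range(n)]

def explain(tokens, predict, n_samples=500, kernel_width=0.75,
            ridge=1e-3, seed=0):
    """Return (token, weight) pairs sorted by absolute surrogate weight."""
    rng = random.Random(seed)
    d = len(tokens)
    p = d + 1  # one coefficient per token plus an intercept
    # Accumulate the weighted normal equations:
    # (X^T W X + ridge * I) beta = X^T W y
    xtwx = [[ridge if j == k else 0.0 for k in range(p)] for j in range(p)]
    xtwy = [0.0] * p
    for i in range(n_samples):
        # The first sample is the unperturbed post; the rest drop random words.
        mask = [1] * d if i == 0 else [rng.randint(0, 1) for _ in range(d)]
        kept = [t for t, keep in zip(tokens, mask) if keep]
        y = predict(kept)
        # Proximity weight: perturbations closer to the original count more.
        dist = 1.0 - sum(mask) / d
        w = math.exp(-(dist ** 2) / kernel_width ** 2)
        x = mask + [1]
        for j in range(p):
            if not x[j]:
                continue
            xtwy[j] += w * y
            for k in range(p):
                xtwx[j][k] += w * x[k]
    beta = solve(xtwx, xtwy)
    return sorted(zip(tokens, beta[:d]), key=lambda kv: -abs(kv[1]))

if __name__ == "__main__":
    post = "my mood swings from manic highs to crushing lows".split()
    for token, weight in explain(post, toy_predict_proba):
        print(f"{token:10s} {weight:+.3f}")
```

In this toy setting the highest-weighted words are the ones the stand-in classifier actually relies on, which mirrors how an explanation can surface condition-specific vocabulary. With a real transformer model, the prediction function would wrap the model's tokenizer and softmax output, and a maintained LIME implementation would normally replace the hand-written surrogate fit.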