Bias in Large Language Models for Mental Health: Evidence from Vignette-Based Evaluation Across Nine Models

Abstract

With the increasing use of large language models (LLMs) for mental health needs and reports of inappropriate and biased responses, it is important to identify determinants of bias in LLM reasoning and responses. This study evaluated LLM-generated responses to mental health vignettes that varied in the severity and nature of symptoms, across 10 social questions such as comfort level with the person as a work colleague and perceived propensity for violence. Nine LLMs (Deepseek, Gemini, Gemma, GPT-3.5, GPT-4, GPT-4o, LLaMA, Microsoft, StabilityAI) were assessed using automated metrics, including BERTScore F1 and ROUGE-L, to quantify divergence from human expert-generated responses (degree of bias). Analyses showed that the models produced responses that differed lexically and semantically from the expert responses. Moreover, significant interaction effects of symptom type and severity across LLMs and types of social questions indicated weaker concordance between LLM and human-expert reasoning for specific symptom-severity combinations, suggesting potential biases and differential generalization. Research and clinical implications, including the importance of human expert oversight throughout the development and application of LLMs for mental health use, are discussed.
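For readers unfamiliar with the two automated metrics named above, the sketch below illustrates how divergence between an LLM response and a human expert-generated reference could be computed. It assumes the open-source bert-score and rouge-score Python packages, and the variable and function names are illustrative; it is not the authors' evaluation code.

```python
# Minimal sketch: compare one LLM response with one human expert-generated response.
# Assumes `pip install bert-score rouge-score`; names are hypothetical.
from bert_score import score as bert_score
from rouge_score import rouge_scorer


def divergence_from_expert(model_response: str, expert_response: str) -> dict:
    """Return semantic similarity (BERTScore F1) and lexical overlap (ROUGE-L F1).

    Lower values indicate greater divergence from the expert reference.
    """
    # BERTScore: contextual-embedding similarity between candidate and reference.
    _, _, f1 = bert_score([model_response], [expert_response], lang="en")

    # ROUGE-L: longest-common-subsequence overlap between candidate and reference.
    rouge = rouge_scorer.RougeScorer(["rougeL"], use_stemmer=True)
    rouge_l = rouge.score(expert_response, model_response)["rougeL"].fmeasure

    return {"bertscore_f1": f1.item(), "rouge_l_f1": rouge_l}


# Example usage with placeholder texts:
# scores = divergence_from_expert(llm_answer_to_vignette, expert_answer_to_vignette)
```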
