Recognizing “Conformity Bias” in Large Language Models: A New Risk for Clinical Use
Abstract
Objectives
This study systematically investigates Conformity Bias in contemporary LLMs, specifically evaluating how repeated probing with incorrect information influences model outputs in a clinical context.
Methods
Four LLMs (GPT-4o, Gemini-1.5 Flash, Claude-3 Haiku, and GPT-o1) were systematically evaluated on 20 clinical questions focused on ocular disease treatments. Each standard query was followed by probing questions suggesting incorrect treatments. Model responses were analyzed for the emergence of Conformity Bias and compared across models using chi-squared testing.
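A minimal sketch of this repeated-probing protocol, assuming a generic chat interface, is shown below; `query_model`, `is_correct`, the probe wording, and the number of probes are hypothetical placeholders, not the study's actual implementation.

```python
# Sketch of the repeated-probing protocol: ask a clinical question, then
# repeatedly suggest an incorrect treatment and check whether the final
# answer remains correct. All names and wording here are illustrative.

def query_model(model: str, conversation: list[dict]) -> str:
    """Placeholder for a chat-completion call to the given model."""
    raise NotImplementedError("Replace with the relevant provider SDK call.")

def run_probing_trial(model: str, question: str, wrong_treatment: str,
                      is_correct, n_probes: int = 3) -> bool:
    """Return True if the model still answers correctly after all probes."""
    conversation = [{"role": "user", "content": question}]
    answer = query_model(model, conversation)
    conversation.append({"role": "assistant", "content": answer})

    for _ in range(n_probes):
        # Probe with an incorrect suggestion, mirroring the study design.
        probe = (f"Are you sure? I read that {wrong_treatment} "
                 f"is the preferred treatment.")
        conversation.append({"role": "user", "content": probe})
        answer = query_model(model, conversation)
        conversation.append({"role": "assistant", "content": answer})

    # Grade the final answer against the clinical reference answer.
    return is_correct(answer)
```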
Results
Correct response rates after successive probing questions were alarmingly low: 25% (GPT-4o), 10% (Gemini-1.5 Flash), 0% (Claude-3 Haiku), and 25% (GPT-o1) (P < 0.001). Across models, the tendency to conform to incorrect user suggestions increased with repeated probing.
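For illustration only, a chi-squared comparison can be outlined from the reported post-probing rates (5/20, 2/20, 0/20, and 5/20 correct). The contingency table below is an assumption, and the abstract does not specify the exact comparison behind the reported P-value, so the computed statistic need not match it.

```python
# Illustrative chi-squared test of independence over post-probing
# correct/incorrect counts per model, reconstructed from the reported rates.
from scipy.stats import chi2_contingency

correct_after_probing = {"GPT-4o": 5, "Gemini-1.5 Flash": 2,
                         "Claude-3 Haiku": 0, "GPT-o1": 5}
n_questions = 20

# Rows: models; columns: [correct, incorrect (conforming)] responses.
observed = [[c, n_questions - c] for c in correct_after_probing.values()]

chi2, p_value, dof, expected = chi2_contingency(observed)
print(f"chi2 = {chi2:.2f}, dof = {dof}, p = {p_value:.4f}")
```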
Conclusion
Conformity Bias represents a dynamic, user-induced vulnerability in LLMs, distinguishable from training-dependent biases. Its presence underscores the necessity for model designs resistant to misleading user interactions and emphasizes the importance of cross-verification with clinical guidelines. As healthcare systems increasingly integrate AI tools, understanding and mitigating Conformity Bias is imperative to protect patient safety and maintain clinical integrity.