Investigating toxicity and bias in Stable Diffusion text-to-image models

Matthias Schneider
Thilo Hagendorff

Abstract

Text-to-image models are increasingly popular and impactful, yet concerns regarding their safety and fairness remain. This study investigates the ability of ten popular Stable Diffusion models to generate harmful images, including sexual, violent, and personally sensitive material. We demonstrate that these models respond to harmful prompts by generating inappropriate content, which frequently displays troubling biases, such as the disproportionate portrayal of Black individuals in violent contexts. Our findings reveal a complete lack of refusal behavior or safety measures in the models examined. We emphasize the importance of addressing this issue as image generation technologies become more accessible and more widely incorporated into everyday applications.

Version published to 10.1038/s41598-025-12032-4
Aug 26, 2025
Version published to 10.21203/rs.3.rs-5746189/v1 on Research Square
May 2, 2025

Related articles

Cross-Modal Bias Transfer in Aligned Video Diffusion Models

This article has 4 authors:
1. Yuki Nakamura
2. Kenji Sato
3. Ayaka Suzuki
4. Hiroshi Tanaka
This article has no evaluations. Latest version: Jan 27, 2026

Dressing-up disinformation: the contextual presentation of lies

This article has 3 authors:
1. Akos Szegofi
2. Oana Stanciu
3. Christophe Heintz
This article has no evaluations. Latest version: Jan 9, 2026

Bias In, Symbolic Compliance Out? GPT’s Reliance on Gender and Race in Strategic Evaluations

This article has 2 authors:
1. Tristan Botelho
2. Qingyang (Iris) Wang
This article has no evaluations. Latest version: Jan 30, 2026
