Religious Bias Benchmarks for ChatGPT


Abstract

The objectives of this study are to: 1) estimate the frequency of six types of biases in ChatGPT’s responses to religious belief-specific morality and ethics questions, 2) assess how those biases vary by religious belief and ChatGPT model version, and 3) determine how model engineering techniques affect these biases. ChatGPT responses were collected for a set of 112 general morality and ethics questions, each individually tailored to five belief systems: Zen Buddhism, Catholicism, Sunni Islam, Orthodox Judaism, and secular humanism. The resulting questions were then posed ten times to various baseline and derivative ChatGPT version 3 and version 4 models, with and without the application of prompt engineering. The final ChatGPT response dataset contained 45,920 query responses and over 11.4 million words of text. Analyses showed that the dataset contained explicit, anthropomorphic, statement, framing, and coverage biases, often in favor of Buddhism or secular humanism and/or against the Abrahamic religions. Three of the biases (explicit, coverage, and framing bias) were mitigated by the more advanced GPT-4 models, but two (anthropomorphic and statement bias) were higher with GPT-4. Analysis of the sixth bias, information bias, was inconclusive, although a potential link was found between responses containing unsafe speech and ChatGPT hallucinations and multilingual response errors. None of the model engineering approaches tested (persona assumption, N-shot engineering, model fine-tuning, or research assistants) eliminated all biases.
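To make the data-collection protocol concrete, the sketch below illustrates the kind of query loop the abstract describes: each question is tailored to a belief system and posed ten times to each model. It is a minimal illustration only; the tailoring template, model identifiers, and helper names (`tailor`, `collect`) are assumptions for exposition, not the authors' actual materials or code. It assumes the official `openai` Python client with an `OPENAI_API_KEY` set in the environment.

```python
# Illustrative sketch of the collection protocol: 112 questions x 5 belief
# systems x 10 repetitions per model configuration. Placeholder values only.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

BELIEF_SYSTEMS = [
    "Zen Buddhism", "Catholicism", "Sunni Islam",
    "Orthodox Judaism", "secular humanism",
]
MODELS = ["gpt-3.5-turbo", "gpt-4"]  # stand-ins for the versions tested
REPETITIONS = 10  # each tailored question is posed ten times


def tailor(question: str, belief: str) -> str:
    """Hypothetical tailoring: embed the belief system in the question."""
    return f"From the perspective of {belief}: {question}"


def collect(questions: list[str]) -> list[dict]:
    """Pose every tailored question REPETITIONS times to each model."""
    rows = []
    for model in MODELS:
        for belief in BELIEF_SYSTEMS:
            for question in questions:
                prompt = tailor(question, belief)
                for rep in range(REPETITIONS):
                    resp = client.chat.completions.create(
                        model=model,
                        messages=[{"role": "user", "content": prompt}],
                    )
                    rows.append({
                        "model": model,
                        "belief": belief,
                        "question": question,
                        "repetition": rep,
                        "response": resp.choices[0].message.content,
                    })
    return rows
```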