Moral Stereotyping in Large Language Models
Abstract
Can Large Language Models (LLMs) accurately estimate various societies’ moral values? Here, we query the perceptions of the GPT family of LLMs for the “average” person from 48 countries and compare them to a large-scale (n = 93,198) survey of six moral values (Care, Equality, Proportionality, Loyalty, Authority, and Purity) from those countries. Our findings indicate that LLMs poorly capture the moral diversity around the globe, systematically overestimating some moral values (especially Care) and underestimating others (especially Purity). Notably, examining various versions of GPT shows that these LLMs may overestimate the overall moral concerns of some Western countries (e.g., United States, Canada, and Australia) while underestimating those of non-Western countries (e.g., Nigeria, Morocco, and Indonesia). Our work reveals that LLMs are inaccurate generators of cross-cultural estimations in the moral domain; in other words, they stereotype the moral values of cultural populations in predictable ways. Our results highlight the ethical and epistemic risks of relying on LLMs to estimate the endorsement of moral values around the globe.