The Ghost in the Machine Has an American Accent: Exploratory Evidence of Cultural Value Drift in Early GPT-3

Abstract

Early large language models (LLMs) were released with minimal alignment, offering a rare view of how generative systems reframed the ethical values embedded in human texts. We examine outputs from a 2021 version of OpenAI’s base GPT-3, prompting it to summarise culturally diverse source materials (laws, political speeches, and philosophical works) and interpreting the results through the lens of descriptive moral value pluralism. Where possible, we contextualise outputs with cross-national datasets such as the World Values Survey. We document recurring value drift: Australia’s firearm policy is recast as a threat to liberty; de Beauvoir’s feminist critique becomes gender-essentialist dating advice; and Merkel’s humanitarian appeal is reframed as immigration control. In contrast, multilateral documents (UN/UNESCO) exhibit greater value stability, suggesting that consensus-crafted language can buffer against cultural mutation. We argue that these early behaviours (observed before extensive fine-tuning and safety layers) provide a historically important baseline for understanding how training distributions shape normative framing. Our contribution is twofold: (1) empirical evidence that value drift can invert or overwrite embedded values along predictable cultural axes, and (2) a pluralist, descriptive evaluation method that surfaces whose values dominate and when. We conclude with implications for culturally inclusive evaluation and alignment in contemporary LLMs.