Emergent Numeric Bias in Large Language Models: An Empirical Study on the Anomalous Recurrence of the Number 27 Across Independent Sessions
Abstract
Recent advancements in Large Language Models (LLMs), including OpenAI’s ChatGPT-4, Anthropic’s Claude 3, and Google’s Gemini Pro and Gemini Flash, have demonstrated exceptional linguistic fluency and contextual reasoning. However, certain behavioural patterns, especially those shaped by token distribution biases, remain underexplored. This study investigates a peculiar and reproducible phenomenon observed across state-of-the-art LLMs when prompted with the neutral instruction: “Pick a number between 1 and 50.” In over 800 automated, session-isolated trials, the number 27 appeared disproportionately as the first response, with an occurrence rate of over 92% across models. The effect diminished dramatically, to under 3%, when the same prompt was repeated within a continued session context. This behaviour highlights the deterministic pseudo-randomness inherent in LLM outputs: apparent randomness that in fact emerges from probabilistic token generation conditioned on pretraining data distributions. The pattern appeared consistently across models whenever the prompt was issued from a clean, context-free session. An automated pipeline built with Python and pyautogui captured and archived a screenshot of every output to ensure reproducibility. These findings suggest that LLMs inherit and amplify human-generated statistical quirks from their training corpora, reflecting latent biases even in prompts intended to elicit randomness. The study contributes to the expanding discourse on LLM interpretability, cognitive defaults, and bias detection, offering both a reproducible dataset and an open-source automation toolchain to encourage further exploration of emergent decision-making behaviours in generative models.
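
The abstract describes an automated, session-isolated trial loop built with Python and pyautogui that archives a screenshot per output. The following is a minimal sketch of such a loop, not the study's actual toolchain: the screen coordinates, wait times, and names `NEW_CHAT_BUTTON` and `PROMPT_BOX` are illustrative assumptions that would need to be calibrated for a specific chat interface.

```python
# Minimal sketch of a session-isolated trial loop with pyautogui.
# Assumes the chat UI is already open and focused in a browser window;
# NEW_CHAT_BUTTON and PROMPT_BOX are hypothetical screen coordinates.
import time
import pyautogui

PROMPT = "Pick a number between 1 and 50."
NEW_CHAT_BUTTON = (60, 120)   # hypothetical location of the "new chat" control
PROMPT_BOX = (640, 900)       # hypothetical location of the message input field
N_TRIALS = 800

for trial in range(N_TRIALS):
    # Start a fresh session so earlier turns cannot condition the reply.
    pyautogui.click(*NEW_CHAT_BUTTON)
    time.sleep(2)

    # Type the neutral prompt and submit it.
    pyautogui.click(*PROMPT_BOX)
    pyautogui.write(PROMPT, interval=0.02)
    pyautogui.press("enter")

    # Wait for the model to finish responding, then archive a full-window
    # screenshot as the raw record for this trial.
    time.sleep(15)
    pyautogui.screenshot(f"trial_{trial:04d}.png")
```

Because the responses are captured only as screenshots, a separate step (manual review or OCR) would be needed to tabulate which number each model returned before computing occurrence rates.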