Uncensored AI in the Wild: Tracking Publicly Available and Locally Deployable LLMs

Abstract

Open-weight generative large language models (LLMs) can be freely downloaded and modified, yet little empirical evidence exists on how these models are systematically altered and redistributed. This study provides a large-scale empirical analysis of safety-modified open-weight LLMs, drawing on 8,608 model repositories and evaluating 20 representative modified models on unsafe prompts designed to elicit, for example, election disinformation, instructions for criminal activity, and regulatory evasion. Modified models exhibit substantially higher compliance: unmodified models complied with an average of 19.2% of unsafe requests, whereas modified variants complied at an average rate of 80.0%. Modification effectiveness was independent of model size, with smaller 14-billion-parameter variants sometimes matching or exceeding the compliance levels of 70-billion-parameter versions. The ecosystem is highly concentrated yet structurally decentralized; for example, the top 5% of providers account for over 60% of downloads and the top 20 for nearly 86%. Moreover, more than half of the identified models use GGUF packaging, which is optimized for consumer hardware, and 4-bit quantization methods proliferate widely, though full-precision and lossless 16-bit models remain the most downloaded. These findings show that locally deployable, modified LLMs represent a paradigm shift for Internet safety governance, calling for new regulatory approaches suited to decentralized AI.
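
For readers unfamiliar with the deployment pattern the abstract describes, the following is a minimal, illustrative sketch (not taken from the paper) of how GGUF-packaged, 4-bit quantized models can be enumerated on a public model hub and run locally on consumer hardware. It assumes the huggingface_hub and llama-cpp-python packages; the repository and file names are hypothetical placeholders, not models identified in the study.

    # Illustrative sketch only: the GGUF + 4-bit local-deployment pattern
    # referenced in the abstract. Repo and file names below are hypothetical.
    from huggingface_hub import HfApi, hf_hub_download
    from llama_cpp import Llama

    api = HfApi()

    # Enumerate publicly listed repositories whose names mention GGUF packaging.
    for model in api.list_models(search="gguf", limit=5):
        print(model.id)

    # Download one 4-bit quantized GGUF file (hypothetical repo/filename)
    # and run it locally on CPU-class consumer hardware.
    path = hf_hub_download(
        repo_id="example-org/example-model-GGUF",  # hypothetical repository
        filename="example-model.Q4_K_M.gguf",      # 4-bit quantization variant
    )
    llm = Llama(model_path=path, n_ctx=2048)
    print(llm("Hello", max_tokens=32)["choices"][0]["text"])

Because such quantized GGUF files fit in ordinary desktop or laptop memory, this pattern requires no hosted API and no provider-side safety layer, which is what makes the locally deployable, modified variants discussed above difficult to govern centrally.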
