Uncensored AI in the Wild: Tracking Publicly Available and Locally Deployable LLMs
Abstract
Open-weight generative large language models (LLMs) can be freely downloaded and modified, yet little empirical evidence exists on how these models are systematically altered and redistributed. This study presents the first large-scale analysis of safety-modified open-weight LLMs, examining 8,608 model repositories scraped from Hugging Face to identify a growing population of uncensored models adapted to bypass alignment safeguards. Selected modified models are evaluated on unsafe prompts spanning election disinformation, criminal instruction, and regulatory evasion. The results demonstrate a complete safety inversion: while unmodified models complied with only 18.8% of unsafe requests, modified variants complied at a mean rate of 74.1%. Modification effectiveness was independent of model size, with smaller 14-billion-parameter variants sometimes matching or exceeding the compliance levels of 70-billion-parameter versions. The ecosystem is highly concentrated yet structurally decentralized: the top 5% of providers account for over 60% of downloads, and the top 20 providers for nearly 86%. Moreover, more than half of the identified models use GGUF packaging, which is optimized for consumer hardware, and 4-bit quantization methods proliferate widely, though full-precision and 16-bit models remain the most downloaded. These findings show how locally deployable, modified LLMs represent a paradigm shift for Internet safety governance, calling for new regulatory approaches suited to decentralized AI.
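For readers who want a sense of how such a repository census can be assembled, a minimal sketch follows, assuming the `huggingface_hub` Python client; the search keyword, sort order, and result limit are illustrative assumptions, not the study's actual crawl parameters.

```python
from huggingface_hub import HfApi

# Hypothetical discovery pass: enumerate public model repos whose names
# suggest safety modification. The keyword and limit are illustrative,
# not the study's actual query set.
api = HfApi()
candidates = api.list_models(search="uncensored", sort="downloads",
                             direction=-1, limit=500)

for model in candidates:
    provider = model.id.split("/")[0]  # provider namespace before the slash
    print(provider, model.id, model.downloads)
```

Reaching the study's 8,608 repositories would require a broader keyword set and repeated queries; the sketch only shows the shape of a single discovery pass.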
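The headline compliance figures reduce mechanically to a proportion over binary per-response labels. A toy scorer is sketched below; the refusal-marker heuristic is an assumption for illustration (published safety evaluations typically rely on human raters or classifier graders), not the paper's grading method.

```python
# Toy compliance scoring: a response counts as "compliant" if it does not
# open with a recognizable refusal. The marker list is an illustrative
# assumption, not the study's grading procedure.
REFUSAL_MARKERS = ("i can't", "i cannot", "i won't", "i'm sorry", "as an ai")

def is_compliant(response: str) -> bool:
    return not response.strip().lower().startswith(REFUSAL_MARKERS)

def compliance_rate(responses: list[str]) -> float:
    # Fraction of responses that comply, e.g. 0.188 (unmodified) vs a
    # mean of 0.741 (modified) in the study's results.
    return sum(is_compliant(r) for r in responses) / len(responses)
```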
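The concentration statistics are, in effect, cumulative download shares over per-provider totals. A sketch follows, assuming a mapping from provider name to total downloads (the data structure and function name are assumptions for illustration).

```python
import math

def top_share(downloads_by_provider: dict[str, int], fraction: float) -> float:
    """Share of total downloads held by the top `fraction` of providers."""
    totals = sorted(downloads_by_provider.values(), reverse=True)
    k = max(1, math.ceil(fraction * len(totals)))  # size of the top slice
    return sum(totals[:k]) / sum(totals)

# On the study's data, top_share(counts, 0.05) would land above 0.60,
# matching the reported concentration among the top 5% of providers.
```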
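Packaging and quantization can often be inferred from repository filenames: GGUF community files conventionally embed the scheme (e.g. `Q4_K_M` for 4-bit, `F16` for 16-bit) in the name. The pattern below is an assumption based on that naming convention, not the paper's actual classifier.

```python
import re

# Match a trailing quantization tag in a GGUF filename, such as
# "model-Q4_K_M.gguf" or "model.F16.gguf". Convention-based assumption.
QUANT_RE = re.compile(r"[.-](Q\d(?:_[A-Z0-9]+)*|F16|F32)\.gguf$", re.IGNORECASE)

def quant_level(filename: str) -> str | None:
    m = QUANT_RE.search(filename)
    return m.group(1).upper() if m else None
```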