LatentRecurrentDepthLM: An Open-Source Framework for Recurrent-Depth Language Models with Controllable Test-Time Compute
Abstract
LatentRecurrentDepthLM is a modular, production-ready open-source framework implementing a hybrid recurrent-depth language model that decouples effective reasoning depth from parameter count. The model iterates a single weight-shared block over a continuous latent state, enabling controllable test-time compute scaling without generating intermediate tokens or modifying model weights. Built in PyTorch with full Hugging Face Transformers compatibility, the framework provides end-to-end pipelines for dataset preparation, tokenization, training with randomized iteration depth and cosine scheduling, autoregressive generation with temperature and top-k sampling, and one-command Hub deployment via a custom PreTrainedModel subclass. This paper documents the software architecture, core algorithms, training and inference workflows, practical use cases, and comparisons with related tools, and connects the framework to recent advances in recurrent-depth and latent reasoning research, serving researchers, educators, and practitioners exploring parameter-efficient sequence modeling. The repository (codewithdark-git/LatentRecurrentDepthLM) and the Hugging Face model checkpoint are released under the MIT license.
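The core idea described above can be illustrated with a minimal PyTorch sketch. This is not the framework's actual implementation; the block structure and function names here (`RecurrentDepthBlock`, `run_with_depth`) are illustrative assumptions, showing only how iterating one weight-shared block over a latent state makes depth a test-time knob independent of parameter count.

```python
import torch
import torch.nn as nn

class RecurrentDepthBlock(nn.Module):
    """Hypothetical single block whose weights are reused at every depth step."""
    def __init__(self, d_model: int):
        super().__init__()
        self.norm = nn.LayerNorm(d_model)
        self.ff = nn.Linear(d_model, d_model)

    def forward(self, h: torch.Tensor) -> torch.Tensor:
        # Residual update of the continuous latent state.
        return h + self.ff(self.norm(h))

def run_with_depth(block: nn.Module, h: torch.Tensor, num_iters: int) -> torch.Tensor:
    """Apply the same block num_iters times: more iterations = more
    test-time compute, with no new parameters and no intermediate tokens."""
    for _ in range(num_iters):
        h = block(h)
    return h

block = RecurrentDepthBlock(d_model=16)
h0 = torch.randn(2, 5, 16)          # (batch, seq, hidden) latent state
shallow = run_with_depth(block, h0, num_iters=2)
deep = run_with_depth(block, h0, num_iters=8)  # deeper reasoning, same weights
```

During training, the framework's randomized iteration depth would correspond to sampling `num_iters` per batch so the model learns to be useful at many depths; at inference, the caller chooses the depth to trade compute for quality.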