Token-Centric Representations in Large Language Models: Analyzing Llama and Mistral Through the Lens of Rate-Distortion Theory
Abstract
Token-centric representations play a crucial role in how language models understand and generate human language, influencing the accuracy and efficiency of downstream tasks. This work applies rate-distortion theory to the analysis of token representations, offering a new perspective on how compression affects the retention of linguistic information within language models. Through a systematic evaluation of two prominent models, Llama and Mistral, the study examines the trade-offs between token compression and representational fidelity, revealing distinct patterns in their respective tokenization strategies. Experimental results show that Llama maintains higher accuracy than Mistral as compression rates increase, indicating a more robust tokenization approach. The analysis further highlights the importance of optimizing token embeddings to achieve scalable, adaptable models capable of performing a wide range of language processing tasks. The findings contribute to the broader discourse on model efficiency, offering a framework for developing future models that balance complexity and performance effectively.
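For context, the compression-fidelity trade-off described above can be framed through the classical rate-distortion function; the definition below is standard background from information theory and does not reproduce the paper's own notation or distortion measure:

R(D) = \min_{p(\hat{x} \mid x)\,:\; \mathbb{E}[d(X, \hat{X})] \le D} I(X; \hat{X})

Here X denotes the original token representation, \hat{X} its compressed counterpart, d a distortion measure (for instance, mean squared error between embeddings), and I(X; \hat{X}) the mutual information. R(D) gives the minimum rate, in bits per token, required to keep the expected distortion below a budget D, which is the quantity the reported accuracy-versus-compression comparisons implicitly probe.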