Towards Million-Token Context Windows: A Topology-Preserving Framework for Adaptive Transformer Sparsification
Abstract
The O(N²) self-attention complexity limits Transformer scaling to million-token contexts, a challenge mirrored in genomic networks, where combinatorial explosion constrains pairwise comparisons. Hardware optimizations and linear-cost architectures provide partial relief but trade global associative recall for local efficiency. We introduce Reduced Interaction Sampling (RIS), a stochastic framework that cuts computational cost by over 99% while preserving structural fidelity. Dense interaction matrices in real-world systems are topologically over-specified; RIS adaptively samples key relationships to preserve the structural skeleton. Validation on a social network with N ≈ 4 million nodes shows degree centrality preservation at ρ = 0.96 and a 15% improvement over standard sparse attention in global hub detection. The framework offers a grounded route toward sustainable long-context language understanding and high-throughput biological network analysis.
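As a minimal sketch of the core idea, the Python snippet below uniformly subsamples roughly 1% of the pairwise interactions of a toy random graph and measures how well degree centrality survives via Spearman's ρ. The graph model, the 1% rate, and the uniform sampling scheme are illustrative assumptions on our part; the paper's RIS sampler is adaptive rather than uniform.

```python
# Illustrative sketch only: uniform edge subsampling standing in for RIS's
# adaptive interaction sampling. Checks how well degree centrality is
# preserved after discarding >99% of pairwise interactions.
import numpy as np
from scipy.stats import spearmanr

rng = np.random.default_rng(0)

# Toy interaction graph: N nodes, M random undirected edges (assumed model).
N, M = 5_000, 200_000
edges = rng.integers(0, N, size=(M, 2))

# Keep ~1% of interactions, mirroring the >99% cost reduction in the abstract.
keep = rng.random(M) < 0.01
sampled = edges[keep]

def degree_centrality(e, n):
    # Count how often each node appears as an endpoint; normalize by (n - 1).
    deg = np.bincount(e.ravel(), minlength=n)
    return deg / (n - 1)

# Rank correlation between dense and subsampled centrality estimates.
rho, _ = spearmanr(degree_centrality(edges, N),
                   degree_centrality(sampled, N))
print(f"kept {keep.mean():.1%} of interactions, Spearman rho = {rho:.3f}")
```

On a graph this size the rank correlation typically remains high even at a 1% sampling rate, which is the intuition behind the abstract's claim that dense interaction matrices are topologically over-specified.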