Towards Million-Token Context Windows: A Topology-Preserving Framework for Adaptive Transformer Sparsification
Abstract
The O(N²) self-attention complexity limits Transformer scaling to million-token contexts, a challenge mirrored in genomic networks, where combinatorial explosion constrains pairwise comparisons. Hardware optimizations and linear-cost architectures provide partial relief but trade global associative recall for local efficiency. We introduce Reduced Interaction Sampling (RIS), a stochastic framework that cuts computational cost by over 99% while preserving structural fidelity. Dense interaction matrices in real-world systems are topologically over-specified; RIS adaptively samples key relationships to preserve the structural skeleton. Validation on a social network with N ≈ 4 million nodes shows degree centrality preservation at ρ = 0.96 and a 15% improvement over standard sparse attention in global hub detection. The framework offers a grounded route toward sustainable long-context language understanding and high-throughput biological network analysis.
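As a minimal sketch of the core idea, the Python snippet below uniformly subsamples roughly 1% of the pairwise interactions of a toy random graph and measures how well degree centrality survives via Spearman's ρ. The graph model, the 1% rate, and the uniform sampling scheme are illustrative assumptions on our part; the paper's RIS sampler is adaptive rather than uniform.

```python
# Illustrative sketch only: uniform edge subsampling standing in for RIS's
# adaptive interaction sampling. Checks how well degree centrality is
# preserved after discarding >99% of pairwise interactions.
import numpy as np
from scipy.stats import spearmanr

rng = np.random.default_rng(0)

# Toy interaction graph: N nodes, M random undirected edges (assumed model).
N, M = 5_000, 200_000
edges = rng.integers(0, N, size=(M, 2))

# Keep ~1% of interactions, mirroring the >99% cost reduction in the abstract.
keep = rng.random(M) < 0.01
sampled = edges[keep]

def degree_centrality(e, n):
    # Count how often each node appears as an endpoint; normalize by (n - 1).
    deg = np.bincount(e.ravel(), minlength=n)
    return deg / (n - 1)

# Rank correlation between dense and subsampled centrality estimates.
rho, _ = spearmanr(degree_centrality(edges, N),
                   degree_centrality(sampled, N))
print(f"kept {keep.mean():.1%} of interactions, Spearman rho = {rho:.3f}")
```

On a graph this size the rank correlation typically remains high even at a 1% sampling rate, which is the intuition behind the abstract's claim that dense interaction matrices are topologically over-specified.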