Characterizing Attention-Based Sequence Models for Wireless Edge Cache Replacement: A Short-Horizon Steady-State Baseline
Abstract
We study wireless edge cache replacement in a short-horizon steady state, where demand is heavy-tailed and the top set changes slowly. Over minutes to hours the request process is well approximated by the Independent Reference Model (IRM) with a Mandelbrot--Zipf (MZipf) law, so replacement quality depends on tracking global popularity rather than slow drift. In this regime, Top-\(M\) (LFU-Oracle) provides the stationary optimum and Belady (MIN) provides a clairvoyant upper bound, yielding principled references. This work fills a gap in prior evaluations by introducing an interpretable, oracle-grounded baseline for wireless edge replacement. Under a controlled IRM/MZipf workload, we benchmark attention-based policies against strong heuristics and stationary oracles and diagnose behavior via Spearman's rank correlation. We implement three offline-trained sequence models (a Long Short-Term Memory (LSTM) network, an LSTM with attention, and an encoder-only Transformer) as an edge Replace Module that makes joint admission--eviction decisions, and evaluate them against Least Recently Used (LRU), Least Frequently Used (LFU), Adaptive Replacement Cache (ARC), and Window TinyLFU (W-TinyLFU), with Top-\(M\) and Belady as references. Results show that attention acts as a global-popularity estimator, outperforms heuristic baselines, remains within 2--3 percentage points of Top-\(M\) at small caches, with the gap narrowing as \(M\) increases, and is stable across sequence length \(W\). Per-decision CPU latency is on the order of tens of milliseconds with a small footprint, enabling CPU-only deployment at access points and small cells. This baseline clarifies what attention learns and guides extensions to popularity drift, mobility, and cooperative multi-cache operation.
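The abstract's evaluation setup can be sketched in a few lines: sample an IRM request stream from an MZipf popularity law, compute the stationary Top-\(M\) hit ratio as the sum of the top \(M\) probabilities, and compare a simple LRU cache against it. This is a minimal illustrative sketch under assumed parameters (catalog size \(n=1000\), cache size \(M=50\), \(\alpha=0.8\), \(q=5\)); these are not the paper's actual settings, and the function names are invented for illustration.

```python
import bisect
import itertools
import random

def mzipf_probs(n, alpha=0.8, q=5.0):
    """Mandelbrot-Zipf popularity law: p(i) proportional to 1/(i+q)^alpha, ranks i=1..n."""
    w = [1.0 / (i + q) ** alpha for i in range(1, n + 1)]
    s = sum(w)
    return [x / s for x in w]

def simulate(n=1000, m=50, requests=100_000, seed=0):
    """Simulate an IRM/MZipf stream through an LRU cache of size m.

    Returns (empirical LRU hit ratio, expected Top-M hit ratio).
    """
    rng = random.Random(seed)
    p = mzipf_probs(n)                        # probabilities already sorted by rank
    cum = list(itertools.accumulate(p))       # CDF for fast inverse-transform sampling
    # Under IRM, the stationary optimum (Top-M / LFU-Oracle) pins the M most
    # popular items, so its expected hit ratio is the sum of their probabilities.
    top_m_expected = sum(p[:m])
    cache, order, hits = set(), [], 0         # LRU state: contents + recency list
    for _ in range(requests):
        x = bisect.bisect(cum, rng.random())  # draw one IRM request
        if x in cache:
            hits += 1
            order.remove(x)                   # refresh recency on a hit
        else:
            if len(cache) >= m:
                cache.discard(order.pop(0))   # evict the least recently used item
            cache.add(x)
        order.append(x)
    return hits / requests, top_m_expected
```

Because the stream is memoryless, recency carries no extra information here, so LRU lands below the Top-\(M\) reference; this gap is what separates heuristics from the stationary oracle in the benchmark described above.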