Standard Attention as a Small-Angle Limit of Riemannian Geometric Algebra Transformers
Abstract
Transformers implicitly assume a flat representational geometry: similarity is Euclidean, composition is linear, and attention is a softmax of dot products. We show that these operations arise as a degenerate limit of a curved geometric operator family on the rotor manifold Spin(3), realized in the Clifford algebra Cl(3, 0). In the small-angle regime, geometric softmax reduces to classical dot-product attention; yet, at depth, many small-angle steps accumulate curvature through Baker–Campbell–Hausdorff commutators, providing a precise sense in which standard Transformers approximate rotor dynamics. The paper focuses on this mathematical discovery and its structural implications for interpretability; performance studies are a separate empirical program.
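The role of the Baker–Campbell–Hausdorff commutator can be checked numerically. The sketch below is not the paper's construction: it assumes the standard identification of Spin(3) with unit quaternions, uses the Hamilton product convention, and introduces hypothetical helper names (`qmul`, `qexp`, `qlog`, `bch_errors`). Under those assumptions it composes two small rotors and compares the logarithm of the product with the flat (additive) approximation a + b and with the second-order approximation a + b + ½[a, b], which for pure quaternions equals a + b + a × b. The sign of the commutator term depends on the rotor convention chosen, and the "angles" here are rotor half-angles.

```python
import math

def qmul(p, q):
    """Hamilton product of two quaternions given as (w, x, y, z)."""
    pw, px, py, pz = p
    qw, qx, qy, qz = q
    return (
        pw * qw - px * qx - py * qy - pz * qz,
        pw * qx + px * qw + py * qz - pz * qy,
        pw * qy - px * qz + py * qw + pz * qx,
        pw * qz + px * qy - py * qx + pz * qw,
    )

def qexp(v):
    """Exponential of a pure quaternion v = (x, y, z); returns a unit quaternion."""
    n = math.sqrt(sum(c * c for c in v))
    if n < 1e-12:
        return (1.0, v[0], v[1], v[2])
    s = math.sin(n) / n
    return (math.cos(n), s * v[0], s * v[1], s * v[2])

def qlog(q):
    """Logarithm of a unit quaternion near the identity, as a pure quaternion (x, y, z)."""
    w, x, y, z = q
    n = math.sqrt(x * x + y * y + z * z)
    if n < 1e-12:
        return (x, y, z)
    t = math.atan2(n, w) / n
    return (t * x, t * y, t * z)

def cross(a, b):
    """Cross product of two 3-vectors."""
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )

def dist(u, v):
    """Euclidean distance between two 3-vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))

def bch_errors(eps):
    """Errors of the flat and second-order approximations to log(exp(a) exp(b)).

    a and b are orthogonal generators of size eps (an illustrative choice).
    """
    a = (eps, 0.0, 0.0)
    b = (0.0, eps, 0.0)
    exact = qlog(qmul(qexp(a), qexp(b)))
    flat = tuple(x + y for x, y in zip(a, b))               # a + b
    second = tuple(f + c for f, c in zip(flat, cross(a, b)))  # a + b + (1/2)[a, b]
    return dist(exact, flat), dist(exact, second)

for eps in (0.2, 0.1, 0.05):
    e_flat, e_second = bch_errors(eps)
    print(f"eps={eps}: flat error={e_flat:.3e}, second-order error={e_second:.3e}")
```

In this toy setting the flat error shrinks roughly quadratically in the step size and the commutator-corrected error roughly cubically, which is one concrete reading of the statement that additive composition is the small-angle limit while commutator terms carry the curvature that accumulates over many steps.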