Standard Attention as a Small-Angle Limit of Riemannian Geometric Algebra Transformers


Abstract

Transformers implicitly assume a flat representational geometry: similarity is Euclidean, composition is linear, and attention is a softmax of dot products. We show that these operations arise as a degenerate limit of a curved geometric operator family on the rotor manifold Spin(3), realized in the Clifford algebra Cl(3, 0). In the small-angle regime, geometric softmax reduces to classical dot-product attention; yet, at depth, many small-angle steps accumulate curvature through Baker–Campbell–Hausdorff commutators, providing a precise sense in which standard Transformers approximate rotor dynamics. The paper focuses on this mathematical discovery and its structural implications for interpretability; performance studies are a separate empirical program.
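The two claims in the abstract can be checked numerically in a small sketch, which is not the paper's own construction. It represents rotors in Spin(3) as unit quaternions. The abstract does not define geometric softmax, so the score used here, minus half the squared geodesic angle between query and key rotors, is an assumption chosen only for illustration. All function and variable names are hypothetical.

```python
import math

def qmul(p, q):
    """Hamilton product of quaternions stored as (w, x, y, z)."""
    pw, px, py, pz = p
    qw, qx, qy, qz = q
    return (pw*qw - px*qx - py*qy - pz*qz,
            pw*qx + px*qw + py*qz - pz*qy,
            pw*qy - px*qz + py*qw + pz*qx,
            pw*qz + px*qy - py*qx + pz*qw)

def rotor_exp(v):
    """Exponential of a pure quaternion v = (x, y, z); returns a unit rotor."""
    t = math.sqrt(sum(c * c for c in v))
    if t < 1e-12:
        return (1.0, 0.0, 0.0, 0.0)
    s = math.sin(t) / t
    return (math.cos(t), s * v[0], s * v[1], s * v[2])

def rotor_log(q):
    """Principal logarithm of a unit rotor, as a pure quaternion (x, y, z)."""
    w, x, y, z = q
    n = math.sqrt(x*x + y*y + z*z)
    if n < 1e-12:
        return (0.0, 0.0, 0.0)
    t = math.atan2(n, w)
    return (t * x / n, t * y / n, t * z / n)

def cross(a, b):
    return (a[1]*b[2] - a[2]*b[1], a[2]*b[0] - a[0]*b[2], a[0]*b[1] - a[1]*b[0])

def softmax(scores):
    m = max(scores)
    e = [math.exp(s - m) for s in scores]
    total = sum(e)
    return [x / total for x in e]

def attention_gap(query, angles):
    """Largest weight difference between the ASSUMED geometric scores and dot-product scores."""
    keys = [qmul(query, rotor_exp((t, 0.0, 0.0))) for t in angles]
    dots = [sum(a * b for a, b in zip(query, k)) for k in keys]
    thetas = [math.acos(max(-1.0, min(1.0, d))) for d in dots]
    geo = softmax([-0.5 * t * t for t in thetas])  # assumed geometric score
    flat = softmax(dots)                           # classical dot-product score
    return max(abs(g - f) for g, f in zip(geo, flat))

# Small-angle regime: the two softmaxes nearly coincide; at large angles they separate.
query = rotor_exp((0.3, -0.2, 0.1))
small_gap = attention_gap(query, [0.02, 0.05, 0.10])
large_gap = attention_gap(query, [0.5, 1.0, 1.5])

# BCH: composing two small rotor steps differs from adding their generators.
# For pure quaternions, (1/2)[a, b] equals the cross product a x b.
eps = 0.05
a, b = (eps, 0.0, 0.0), (0.0, eps, 0.0)
composed = rotor_log(qmul(rotor_exp(a), rotor_exp(b)))
residual = tuple(c - (x + y) for c, x, y in zip(composed, a, b))
bch2 = cross(a, b)

print("small-angle gap:", small_gap)
print("large-angle gap:", large_gap)
print("residual:", residual, "second-order BCH term:", bch2)
```

Under these assumptions the small-angle gap comes from fourth-order terms in the angle. The residual of the composed step is dominated by the commutator term, which is the second-order contribution that a purely additive (flat) composition would miss.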
