Emergent Spatio-Semantic Structure in Large Language Model Embedding Spaces


Abstract

Large Language Models (LLMs) are increasingly used in geospatial applications, typically as generators of geographic text or as natural-language interfaces to spatial data. Here, we explore whether LLM embedding spaces can instead serve as geospatial representations that can be exploited directly. Using embeddings extracted from Airbnb property descriptions in London, we show that off-the-shelf LLM embeddings exhibit emergent spatial structure. We further demonstrate that a lightweight residual geo-adapter substantially sharpens this spatial signal, enabling approximate localisation even when explicit geographic references are removed, while preserving the semantic relationships learned during LLM pre-training. These results suggest a path toward spatially explicit foundation models that operate over the spatio-semantic embedding space rather than over generated text.
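The abstract does not specify the geo-adapter's architecture. As a rough illustration only, a "residual" adapter is commonly a small bottleneck network whose output is added back to the frozen input embedding, so the adapted vector stays close to the original; the sketch below uses hypothetical toy dimensions and untrained random weights to show that form, not the paper's actual model.

```python
import numpy as np

rng = np.random.default_rng(0)

def residual_geo_adapter(emb, W1, b1, W2, b2):
    """Residual adapter: frozen embedding plus a small learned correction."""
    h = np.tanh(emb @ W1 + b1)   # low-dimensional bottleneck
    return emb + h @ W2 + b2     # residual connection keeps output near input

d, k = 16, 4                     # toy embedding and bottleneck sizes (hypothetical)
W1 = rng.normal(0, 0.1, (d, k)); b1 = np.zeros(k)
W2 = rng.normal(0, 0.1, (k, d)); b2 = np.zeros(d)

emb = rng.normal(size=(3, d))    # stand-in for frozen LLM embeddings
adapted = residual_geo_adapter(emb, W1, b1, W2, b2)

# Because the correction is additive and small, the adapted vectors drift
# only slightly from the originals, consistent with preserving semantics.
drift = np.linalg.norm(adapted - emb, axis=1) / np.linalg.norm(emb, axis=1)
print(adapted.shape, float(drift.max()))
```

In a real pipeline the adapter weights would be trained against geographic coordinates (e.g. regressing latitude/longitude of the listings) while the LLM embeddings stay frozen; the residual form is what lets the spatial signal sharpen without overwriting pre-trained semantic structure.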