Learning to Model the World: A Survey of World Models in Artificial Intelligence

Abstract

World models (WMs) provide a unified approach for modeling how environments evolve over time by learning predictive representations of states and observations. Recent advances in large-scale generative modeling and multimodal foundation models have substantially broadened their applicability across interactive and multimodal domains. However, existing research remains fragmented across modeling paradigms, application domains, and evaluation protocols. This survey provides a systematic, in-depth review of WMs in artificial intelligence. Based on the modeling paradigms of existing methods, we first categorize WMs into four branches, each with a formal mathematical formulation: observation-level generative, latent space, reinforcement learning-based, and object-centric WMs. We then review a broad range of WM applications spanning robotics, autonomous driving, scientific discovery, game simulation, GUI-based agents, and interpretability and trustworthiness, and we analyze benchmarks, new evaluation metrics, simulation platforms, and comparative results across WMs. Finally, we discuss key challenges, including long-horizon consistency and generalization, and outline promising directions for future research. We maintain an actively updated GitHub repository (https://github.com/JiahuaDong/Awesome-World-Models) to track developments in WMs, and we aim to offer a unified reference for understanding, comparing, and advancing WMs.
