SimSpace: a comprehensive in-silico spatial omics data simulation and modeling framework
Abstract
Tissue function is tightly linked to cellular spatial organization, and recent advances in spatial omics technologies have revealed the importance of spatial context in understanding tissue biology. However, analyzing high-dimensional spatial omics data remains challenging, and the limited availability of datasets with known ground truth complicates the development and evaluation of computational methods. To address this gap, we introduce SimSpace, a flexible simulation framework for generating synthetic spatial cell maps with controllable and biologically grounded organization. In SimSpace, spatial patterns are simulated using a Markov Random Field model, enabling explicit control over spatial autocorrelation, niche structure, and cell-cell interactions. SimSpace supports both reference-free simulations, for testing method behavior under controlled generative models, and reference-based simulations, which learn spatial features from real datasets to produce biologically relevant synthetic tissues. Using a suite of spatial statistics, we demonstrate that SimSpace reproduces key spatial characteristics observed in real spatial transcriptomics datasets. We further illustrate the utility of SimSpace as a testbed for benchmarking diverse computational tasks and as a model for in-silico perturbation experiments. By providing reproducible, ground-truth-controlled datasets, SimSpace facilitates the rigorous development, validation, and evaluation of computational tools in spatial omics.
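To make the Markov Random Field idea concrete, the following is a minimal sketch of how a single interaction parameter can control spatial autocorrelation in a simulated cell-type map. It is not SimSpace's implementation or API: it assumes a simple Potts-style model on a square grid with 4-connected neighbours, sampled by Gibbs updates, and the names `gibbs_potts`, `same_type_fraction`, `beta`, and `n_types` are hypothetical choices made for illustration.

```python
import numpy as np


def gibbs_potts(shape=(30, 30), n_types=3, beta=1.0, n_sweeps=20, seed=0):
    """Sample a cell-type grid from a Potts-style Markov Random Field.

    Illustrative assumption, not SimSpace's model: beta > 0 rewards
    neighbouring cells that share a type (stronger spatial autocorrelation),
    while beta = 0 yields independent, uniformly random labels.
    """
    rng = np.random.default_rng(seed)
    rows, cols = shape
    grid = rng.integers(0, n_types, size=shape)
    for _ in range(n_sweeps):
        for r in range(rows):
            for c in range(cols):
                # Count each cell type among the 4-connected neighbours.
                counts = np.zeros(n_types)
                for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < rows and 0 <= cc < cols:
                        counts[grid[rr, cc]] += 1
                # Conditional probability of type k is proportional to
                # exp(beta * counts[k]); subtract the max for stability.
                logits = beta * counts
                probs = np.exp(logits - logits.max())
                probs /= probs.sum()
                grid[r, c] = rng.choice(n_types, p=probs)
    return grid


def same_type_fraction(grid):
    """Fraction of horizontally or vertically adjacent pairs sharing a type."""
    horiz = grid[:, :-1] == grid[:, 1:]
    vert = grid[:-1, :] == grid[1:, :]
    return (horiz.sum() + vert.sum()) / (horiz.size + vert.size)


if __name__ == "__main__":
    clustered = gibbs_potts(beta=1.5)
    random_like = gibbs_potts(beta=0.0)
    print("same-type neighbour fraction, beta=1.5:", same_type_fraction(clustered))
    print("same-type neighbour fraction, beta=0.0:", same_type_fraction(random_like))
```

In this toy model, raising `beta` produces larger contiguous domains of a single cell type, while `beta = 0` gives a spatially unstructured map. Because the generating parameters are known, such a grid provides a ground truth against which a spatial statistic or clustering method can be checked, which is the general role the abstract describes for simulated data.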