Unlocking Multi-Sample Differential Expression for Spatial Transcriptomics Data with TESSERA

Read the full article See related articles

Discuss this preprint

Start a discussion What are Sciety discussions?

Listed in

This article is not in any list yet, why not save it to one of your lists.
Log in to save this article

Abstract

Spatial transcriptomics allows the unprecedented examination of gene expression levels at the resolution of spatially-situated single cells in a high-throughput manner. As the technology is adopted more broadly, studies frequently collect data from multiple tissue samples, which leads to unique challenges that traditional spatial statistical methods are not equipped to handle. In particular, factors that differ across samples, such as different coordinate systems, different numbers and types of cells, different underlying tissue architectures, among others, preclude the application of traditional methods to our new setting. In this work, we propose a novel method, TESSERA , based on a spatial generalized linear model, for analyzing multi-sample spatial transcriptomics count data. Importantly, we provide a mathematical and computational framework for efficient and scalable model fitting and statistical inference to accompany the specification of our model. Our method for fitting the model enables the estimation of a common set of fixed effects across samples. This allows us to address a variety of differential expression questions, such as identification of which genes are differentially expressed between conditions (e.g., diseases, treatments), while accounting for spatial correlation between cells within a sample. We benchmark our proposed method on simulated data and apply it to a spatial transcriptomics dataset of human kidney samples. We find that our method provides a hitherto nonexistent extension to the multi-sample setting while remaining competitive with or outperforming existing algorithms in the single-sample setting.

Article activity feed