Alignment-free integration of single-nucleus ATAC-seq across species with sPYce


Abstract

Changes in gene regulation largely contribute to differences in cellular identities and phenotypes between species. Single-nucleus assays for transposase-accessible chromatin with sequencing (snATAC-seq) are an efficient strategy to identify putative gene regulatory elements active in a given tissue at single-cell resolution, and have unprecedented potential to provide new insight into evolutionary divergence of regulatory programs. However, no dedicated framework exists to integrate and compare snATAC-seq data across species, whilst methods designed for single-cell gene expression data have serious limitations. Here, we present sPYce, a cross-species snATAC-seq integration method that relies on sequence composition similarities through k-mer histograms. In contrast to other approaches, sPYce does not require orthologous genes or genome alignments to anchor data from different species. Instead, it uses similarity in non-coding regulatory sequence motifs to uncover conserved cellular identities. sPYce can embed datasets from multiple species into the same mathematical space and permits further downstream analysis steps, such as dimensionality reduction, visualisation, clustering, cell type annotation transfer, and motif enrichment. We benchmarked sPYce against existing approaches on two publicly available datasets spanning more than 160 million years of evolution, demonstrating that it successfully uncovers conserved cellular programs whilst preserving biologically relevant species-specific differences. sPYce also implements a significance test for divergence of regulatory motifs between species. By comparing cerebellar development in mouse and opossum, we discovered regulatory divergence in granule cell differentiation programs, particularly driven by Nuclear Factor 1 (NF1). Our extensive evaluation suggests that sPYce is the first easy-to-use, alignment-free cross-species snATAC-seq integration approach, opening novel perspectives to compare gene regulatory evolution across species.
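
To illustrate the general idea of comparing sequences by k-mer composition rather than by alignment, the following is a minimal sketch in Python. It is not sPYce's implementation or API: the function names `kmer_histogram` and `cosine_similarity`, the choice of k, the frequency normalisation, and the use of cosine similarity are all assumptions made for illustration. It only shows how two sequences from unaligned genomes can be placed in the same vector space.

```python
from collections import Counter
from itertools import product
import math


def kmer_histogram(seq, k=3):
    """Return a normalised k-mer frequency vector over the alphabet ACGT.

    Hypothetical helper for illustration; k-mers containing other
    characters (e.g. N) are ignored.
    """
    seq = seq.upper()
    # Fixed k-mer ordering so that vectors from different species are comparable.
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts[km] for km in kmers)
    return [counts[km] / total if total else 0.0 for km in kmers]


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


# Toy sequences standing in for accessible regions from two species.
region_a = "ACGTTGGCACGTTGGC"
region_b = "TTGGCACGTTGGCAAC"
similarity = cosine_similarity(kmer_histogram(region_a), kmer_histogram(region_b))
```

Because every histogram has the same length (4^k entries in a fixed order), sequences from any genome map to the same space without orthologous genes or genome alignments, which is the property the abstract highlights.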
