Evaluating genetic-ancestry inference from single-cell RNA-seq data
Abstract
Characterizing the ancestry of donors in single-cell RNA sequencing (scRNA-seq) studies is critical to ensure the genetic homogeneity of the dataset and reduce biases in analyses, to identify ancestry-specific regulatory mechanisms and understand their downstream role in diseases, and to ensure that existing datasets are representative of human genetic diversity. Although scRNA-seq data are now widely available, information on the ancestry of the donors is often missing, which hinders further analysis. Here we propose a framework to evaluate methods for inferring genetic ancestry from genetic polymorphisms detected in scRNA-seq reads. We demonstrate that widely used tools (e.g., ADMIXTURE) provide accurate inference of genetic ancestry and admixture proportions despite the limited number of genetic polymorphisms identified and the imperfect variant calling from scRNA-seq reads. We inferred genetic ancestry for 196 donors from four scRNA-seq datasets from the Human Cell Atlas and found that an extremely large proportion of donors are of European ancestry. For researchers generating single-cell datasets, we recommend reporting inferred genetic ancestry for all donors and generating datasets that represent diverse ancestries.