Gene specificity landscapes for comparative transcriptomic analysis across tissues, cell types, and species

Curation statements for this article:
  • Curated by eLife

    eLife Assessment

    This important study advances a new computational approach to measure and visualize gene expression specificity across different tissues and cell types. The framework is potentially helpful for improving the way gene expression specificity is defined across biological datasets, especially among single-cell datasets. The evidence supporting the method is generally solid, although further evaluation of the method's robustness and comparison to other approaches would strengthen the conclusions.

This article has been Reviewed by the following groups

Abstract

Gene expression specificity is a biological parameter relevant for understanding the molecular basis of evolutionary constraints and tissue-selective pathogenesis. Many studies have sought to quantify the degree of tissue specificity of individual genes. The growing availability of single-cell transcriptomic data greatly expands the context in which expression specificity can be assessed. We present a computational strategy to globally analyse and compare the specificity of genes and groups of related genes across different contexts. By representing expression profiles in terms of expression level-breadth (L-B) relationships, we quantify specificity and construct 2D landscapes that provide a globally consistent coordinate system for mapping specificity patterns. We characterize these landscapes at different levels of resolution and across species to demonstrate simple strategies for comparative transcriptomics. We use this approach to investigate the tissue, cell type, and neuronal specificity of human genes and generate reference specificity landscapes. Finally, by comparing the specificity of brain cell subtypes across four primate species, we find that their degree of conservation mirrors evolutionary divergence times. Our analysis framework and data resources are available in the R package GeneSLand.

Article activity feed

  2. Reviewer #1 (Public review):

    Summary:

    Bot et al. introduce GeneSLand, a computational framework to quantify and visualize gene expression specificity across diverse transcriptomic datasets. The method leverages expression level-breadth (L-B) relationships to construct multi-level specificity landscapes and derives metrics such as lbSpec and dRate to characterize gene specificity in a threshold-independent manner. The authors demonstrate the applicability of the approach to bulk RNA-seq, single-cell datasets, and cross-species primate brain data, and show that the specificity patterns it captures reflect both tissue-specific expression and evolutionary distances. Overall, the framework represents an interesting and potentially useful contribution to the analysis of gene expression specificity.

    Strengths:

    (1) Introduces an original conceptual framework based on expression level-breadth relationships to characterize gene specificity.

    (2) Provides a threshold-independent approach that could overcome some limitations of classical specificity metrics.

    (3) Demonstrates the applicability of the framework across different biological datasets.

    Weaknesses:

    (1) The method relies on predefined binning thresholds for expression levels, and the sensitivity of the derived metrics to this parameter is not fully explored.

    (2) The advantages of lbSpec relative to established metrics could be more clearly shown with some biological examples.

    (3) The robustness of the framework to noisy datasets, small sample sizes, or lower sequencing depth is not well evaluated.
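
To make weakness (1) concrete, the following Python sketch builds a toy level-breadth curve for a single gene and shows how a summary score can shift with the number of bins. It is not the GeneSLand implementation and does not reproduce lbSpec or dRate. The function names `level_breadth_curve` and `toy_specificity`, the evenly spaced levels relative to the gene's maximum, and the "one minus mean breadth" score are all assumptions made here for illustration only.

```python
from typing import List, Sequence


def level_breadth_curve(expr: Sequence[float], n_bins: int = 10) -> List[float]:
    """Return breadth at n_bins expression levels for one gene.

    ASSUMPTION (not from the reviewed paper): levels are evenly spaced
    fractions of the gene's maximum (1/n_bins, 2/n_bins, ..., 1), and
    breadth is the fraction of contexts at or above each level.
    """
    if n_bins < 1:
        raise ValueError("n_bins must be at least 1")
    if len(expr) == 0:
        raise ValueError("expr must contain at least one value")
    peak = max(expr)
    if peak <= 0:
        # Gene not expressed anywhere: breadth is zero at every level.
        return [0.0] * n_bins
    curve = []
    for k in range(1, n_bins + 1):
        # x >= peak * k / n_bins, written without division to avoid rounding.
        hits = sum(1 for x in expr if x * n_bins >= peak * k)
        curve.append(hits / len(expr))
    return curve


def toy_specificity(expr: Sequence[float], n_bins: int = 10) -> float:
    """Hypothetical summary score: 1 minus the mean breadth over all levels."""
    curve = level_breadth_curve(expr, n_bins)
    return 1.0 - sum(curve) / len(curve)


if __name__ == "__main__":
    graded = [10, 5, 2, 1]  # made-up expression values in four contexts
    for bins in (2, 5, 10):
        print(bins, round(toy_specificity(graded, bins), 3))
```

For this made-up graded profile the toy score is 0.625 with 2 bins, 0.6 with 5 bins, and 0.55 with 10 bins, whereas a uniformly expressed gene scores 0 and a single-context gene scores 0.75 regardless of bin count. A sensitivity analysis of the actual metrics would need to be run with the authors' own code.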

  3. Reviewer #2 (Public review):

    Summary:

    Bot & Davila-Velderrain present a new method to understand expression specificity, based on an analysis of the relation between expression level and breadth for each gene. They show that the method captures biological differences across organs, diverse cell types, and specific cell subtypes, for different biological processes and across species.

    Strengths:

    This manuscript addresses an important question in an original manner, and was a pleasure to read. The authors frame the question very clearly: gene expression is a complex trait, which can be summarized in an informative manner by its specificity. The method the authors propose (which I'll call "LB" in this review) has several attractive features, summarizing different specificity profiles in a more nuanced manner than the widely used tau. They show convincingly that their method captures relevant biology at different scales. I especially appreciated the comparative analyses of specificity within broad cell types and within neuronal subtypes.

    Weaknesses:

    Surprisingly, although the method works well, the authors never compare it to the state of the art. Comments 1 and 2 are therefore my only "major" comments.

    (1) In the Introduction, the authors should explain which shortcomings of existing methods motivate the development of a new one.

    (2) In the Results section, the authors should compare the results of LB with other methods, at least tau and Gini (which is conceptually quite similar to LB).

    (3) It would be good to show the sensitivity of LB to different numbers of bins.

    (4) The conservation of specificity across primates was already reported in Kryuchkova-Mostacci & Robinson-Rechavi 2016 (https://doi.org/10.1371/journal.pcbi.1005274). See also Dunn et al. 2018 (https://doi.org/10.1073/pnas.1707515115) for criticism of this type of naive pairwise comparison.
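
For reference, the two baseline metrics named in comment (2) can be computed per gene in a few lines. The Python sketch below uses the standard definitions of the tau index and the Gini coefficient; the function names and the toy expression values are assumptions chosen for illustration, and the sketch says nothing about how LB itself would compare. Inputs are assumed to be non-negative expression values for one gene across at least two contexts (tau is often applied to log-transformed values, which is left to the caller).

```python
from typing import Sequence


def tau(expr: Sequence[float]) -> float:
    """Tau specificity index: sum(1 - x_i / x_max) / (n - 1).

    0 means uniform expression; 1 means expression in a single context.
    """
    n = len(expr)
    if n < 2:
        raise ValueError("tau needs at least two contexts")
    peak = max(expr)
    if peak <= 0:
        # Convention chosen for this sketch: unexpressed genes score 0.
        return 0.0
    return sum(1.0 - x / peak for x in expr) / (n - 1)


def gini(expr: Sequence[float]) -> float:
    """Gini coefficient of one gene's expression across contexts.

    Uses the rank formula on ascending values:
    G = 2 * sum(i * x_i) / (n * sum(x)) - (n + 1) / n, with i = 1..n.
    """
    n = len(expr)
    if n < 2:
        raise ValueError("gini needs at least two contexts")
    total = sum(expr)
    if total <= 0:
        # Convention chosen for this sketch: unexpressed genes score 0.
        return 0.0
    ranked = sorted(expr)
    weighted = sum(rank * x for rank, x in enumerate(ranked, start=1))
    return 2.0 * weighted / (n * total) - (n + 1) / n


if __name__ == "__main__":
    # Made-up profiles: uniform, graded, and single-context expression.
    for profile in ([10, 10, 10, 10], [10, 5, 2, 1], [10, 0, 0, 0]):
        print(profile, round(tau(profile), 3), round(gini(profile), 3))
```

Both metrics return 0 for a uniformly expressed gene. For a gene expressed in only one of four contexts, tau reaches 1.0 while the Gini coefficient reaches its upper bound of (n - 1)/n = 0.75, so the two are not on the same scale unless Gini is rescaled.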