Systematic investigation of imprinted gene expression and enrichment in the mouse brain explored at single-cell resolution

Curation statements for this article:
  • Curated by eLife

    Summary: The reviewers appreciated the effort to merge most of the available datasets to make a precise survey of the sites of imprinted gene expression and the great resource it could bring to the community. However, the reviewers also felt that the study suffered from methodological bias, was preliminary (no allelic information in particular), and that the conclusions did not go beyond previous reports. The general lack of citation of the name of imprinted genes made it difficult to judge whether conclusions were consistent among the different datasets. Highlighting specific imprinted genes would bring a clearer focus and narrative.

This article has been Reviewed by the following groups

Abstract

Background

Although a number of imprinted genes are known to be highly expressed in the brain, and in certain brain regions in particular, whether they are truly over-represented in the brain has never been formally tested. Using thirteen single-cell RNA sequencing datasets we systematically investigated imprinted gene over-representation at the organ, brain region, and cell-specific levels.

Results

We established that imprinted genes are indeed over-represented in the adult brain, and in neurons particularly compared to other brain cell-types. We then examined brain-wide datasets to test enrichment within distinct brain regions and neuron subpopulations and demonstrated over-representation of imprinted genes in the hypothalamus, ventral midbrain, pons and medulla. Finally, using datasets focusing on these regions of enrichment, we identified hypothalamic neuroendocrine populations and the monoaminergic hindbrain neurons as specific hotspots of imprinted gene expression.

Conclusions

These analyses provide the first robust assessment of the neural systems on which imprinted genes converge. Moreover, the unbiased approach, with each analysis informed by the findings of the previous level, permits highly informed inferences about the functions on which imprinted gene expression converges. Our findings indicate the neuronal regulation of motivated behaviours such as feeding and sleep, alongside the regulation of pituitary function, as functional hotspots for imprinting. This adds statistical rigour to prior assumptions and provides testable predictions for novel neural and behavioural phenotypes associated with specific genes and imprinted gene networks. In turn, this work sheds further light on the potential evolutionary drivers of genomic imprinting in the brain.

Article activity feed

  1. Reviewer #3:

    In this study, Higgs et al. apply a systematic and hierarchical approach to testing the enrichment of imprinted gene expression in (mostly) adult tissues, culminating in a survey at the single-cell and neuronal sub-type level, which the authors achieve by exploiting the now extensive single-cell gene expression datasets. Arguably, there are no great surprises in this analysis: it reinforces previous studies showing or suggesting an enrichment for imprinted genes in the brain, with functions in feeding, parental behaviour, etc. But it is conducted in a rigorous manner and makes highly informed inferences about the expression domains and neuronal subtypes identified. This level of detail is beyond any previous survey; the study will therefore provide an excellent resource (although the fine details of the specific neuronal sub-populations in which imprinted gene expression is enriched are likely to be of interest to specialists only). Having access to two or more single-cell datasets at every level of the analysis provides an important level of confidence in the findings, although there are some discrepancies between the enrichments found when comparing any two datasets. Moreover, the findings will give more prominence to neuronal domains that have received less emphasis in functional studies, for example, the enrichment of imprinted genes within the suprachiasmatic nucleus, implicating roles in circadian processes.

    Imprinted expression covers a range of allelic biases and we are still some way from really understanding what an allelic skew means in comparison to absolute monoallelic expression: biased expression in all cells in a tissue or a mosaic of mono- and biallelically expressing cells. So finding an imprinted gene expressed in a given cell type without knowing whether its expression is actually imprinted in that cell type is a problem. And certainly a significant proportion of more recently discovered brain-expressed imprinted genes seem to fall into a category of paternal bias rather than full monoallelic expression. The authors do acknowledge this caveat in their discussion (lines 491-499). Is it possible to stratify the analysis according to degree of allelic bias? Ultimately, scRNA-seq using hybrid tissues will be important to resolve such issues. In this context, the authors will need to discuss findings in the very recently published paper from Laukoter et al. (Neuron, 2020), although that study focussed on cortical neurons in which Higgs and colleagues do not find imprinted gene enrichments.
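As a rough illustration of the stratification suggested above, the following minimal sketch bins genes by degree of allelic bias, assuming allele-resolved read counts were available (for example from hybrid crosses). The function names, the gene labels, and the thresholds (0.7 and 0.9) are hypothetical choices made for illustration; none of them come from the manuscript.

```python
def allelic_ratio(maternal_reads, paternal_reads):
    """Fraction of allele-informative reads coming from the more expressed allele."""
    total = maternal_reads + paternal_reads
    if total == 0:
        raise ValueError("no allele-informative reads")
    return max(maternal_reads, paternal_reads) / total


def bias_class(maternal_reads, paternal_reads, skew=0.7, mono=0.9):
    """Assign a gene to a bias stratum; both thresholds are illustrative assumptions."""
    ratio = allelic_ratio(maternal_reads, paternal_reads)
    if ratio >= mono:
        return "near-monoallelic"
    if ratio >= skew:
        return "biased"
    return "biallelic"


# Hypothetical (maternal, paternal) read counts per gene; not real data.
counts = {"geneA": (95, 5), "geneB": (20, 80), "geneC": (50, 50)}
strata = {gene: bias_class(m, p) for gene, (m, p) in counts.items()}
```

An enrichment analysis could then be repeated within each stratum to see whether the reported hotspots depend on the weakly biased genes.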

    Another issue that could cloud the analysis, and particularly inference of how PEGs and MEGs could be involved in separate functions, is the issue of complex transcription units. The authors allude to Grb10 in which there are maternally and paternally expressed isoforms largely arising from separate promoters, which also applies to Gnas. There are also cases in which there are imprinted and non-imprinted isoforms. A problem with short-read RNA-seq libraries will be that much of the expression data for a given transcription unit cannot discriminate such differentially imprinted isoforms, as most of the reads mapping to the locus will map to shared exons. This caveat probably also needs to be mentioned in the text.

    The authors give some prominence to Peg3 as an example of the role of imprinted genes in maternal behaviours (e.g., line 508) as reported in the original knock-out (Li et al. 1999). However, this particular Peg3-knock-out associated phenotype has been questioned by a more recent Peg3 knock-out in which it was not observed (Denizot et al. 2016 PMID: 27187722), suggesting that the initial phenotype could be a consequence of the nature of the targeting insertion rather than Peg3 ablation.

    While a general picture that emerges is of imprinted genes acting in concert to influence shared functions (e.g., feeding), the authors also point out cases in which a single imprinted gene contributes to a neuronal function (Ube3a in the case of hippocampal-related learning and memory; lines 511-512) but for which they did not find enrichment of imprinted genes in the relevant neuronal population. This poses some problems, but it could indicate that that particular function of the gene is not the function for which imprinting was selected if the gene is active in other domains, but is rather 'tolerated'. Of course, many imprinted genes will have multiple physiological functions, so the convergence on specific functions probably provides the best (but by no means perfect) basis for discerning the evolutionary imperatives.

  2. Reviewer #2:

    General assessment of the work:

    In this manuscript Higgs and colleagues test the hypothesis that imprinted gene expression is enriched in the brain, and that identifying specific brain regions of enrichment will aid in uncovering physiological roles for imprinted pathways. The authors claim that the hypothesis that imprinted genes are enriched in key brain functions has never been formally/systematically tested. Moreover, they suggest that their analysis represents an unbiased systems-biology approach to this question.

    In our assessment the authors fail to meet these criteria on several major grounds. Firstly, there are multiple instances of methodological bias in their analysis (detailed below). Secondly, the authors claim that their findings are validated by similar test results in 'matched' datasets. However, throughout the authors appear to have avoided identifying individual imprinted genes that are enriched in their analysis (they can be found in a minimally annotated supplementary file). Due to this it is impossible to judge to what extent there is agreement between matched datasets and between levels of the analysis. For these reasons the analysis appears arbitrary rather than systematic, and lacks rigor. Consequently we do not feel that the work of Higgs and colleagues goes beyond previous systematic reports of imprinting in the brain (for example, Gregg, 2010, Babak 2015, in ms reference list).

    Numbered summary of substantive concerns:

    1. Imprinted genes that were identified as enriched are not clearly named or listed

    -The authors use two or more independent datasets at each level to "strengthen any conclusions with convergent findings" (p4 ln96). By this the authors mean that both datasets pass the F-test criteria for enrichment. However, they should show which imprinted genes are allocated to each region, and clearly present the overlap. Are the same genes enriched in the two datasets? Similarly, are the genes enriched in, e.g., the hypothalamus the same genes that are enriched in the ARC?

    -The authors discuss how their main aim of identifying expression "hotspots" helps inform imprinted gene function in the brain. An analysis of the actual genes is therefore crucial (and the assumed next step after identifying the location of enrichment).

    -The authors allocate parental expression enrichment to the brain regions but do not state why they do this analysis.

    -Are imprinted genes in the same cluster co-expressed, as might be expected?

    2. Selection of datasets needs to be more clearly explained (i.e. selection criteria)

    -Their reason for selection "to create a hierarchical sequence of data analysis" - suggests that there could be potential bias in their selection based on previous knowledge of IG action in the brain.

    -Explicit selection criteria would explain the level of similarity between datasets, which is important to establish before datasets are systematically analyzed.

    3. The study is more like a set of independent analyses of individual datasets (rather than one systematic/meta-analysis)

    -Each dataset was individually processed (filtered and normalized) following the original authors' procedure, rather than processing all the raw datasets the same way.

    -"A consistent filter, to keep all genes expressed in at least 20 cells or (when possible) with at least 50 reads" (p7 ln115), our emphasis - which filter was used? This should be consistent throughout.

    -Two different cut-offs were used to identify genes with upregulated expression, making the identification of enriched genes arbitrary (p7 para2).

    -Some datasets contain tissues from various time-points and sexes, but there is no clarification if all the data was included in the analysis. (e.g. the Ximerakis et al. dataset was originally an analysis of young and old mouse brains). This is particularly difficult to interpret when embryonic data is likened to adult data, which is in no way equivalent.

    -The cell-type and tissue-type identities were supplied by the dataset authors, based on their original clustering methods. This can be variable, particularly at the sub-population level.

    4. These differences make it hard to draw connections between the findings from each dataset

    -At some levels, the authors compare two datasets for a "convergence" of IG over-expression. Yet the above differences between datasets and analyses make them difficult to compare (e.g. the comparison of hypothalamic neuronal subtypes with enriched IG expression between two datasets in level 3.a.2 is quite speculative).

    -More generally, the authors draw connections between their findings from each level, but the lack of consistency between analyses may not justify these connections.

    5. Hence, the study does not lead to a definitive set of findings that is new to the field

    -The above reasons suggest that this is not an objective set of data about IG expression in the brain, but rather evidence of certain hotspots for targeted analysis. However, these hotspots were already known.

    -A systematic analysis of raw data using fewer datasets, that then includes and discusses the imprinted genes, may lead to novel findings and a paper with a clearer narrative.
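The overlap requested under concern 1 could be reported with something as simple as the sketch below. It assumes each dataset yields a list of imprinted genes called as enriched in a matched region or cell type; the function name `overlap_report` and the placeholder gene labels are hypothetical, and the Jaccard index is just one reasonable summary of agreement.

```python
def overlap_report(enriched_a, enriched_b):
    """Compare the enriched imprinted genes called in two matched datasets."""
    a, b = set(enriched_a), set(enriched_b)
    shared = a & b
    union = a | b
    # Jaccard index: shared genes as a fraction of all genes called in either dataset.
    jaccard = len(shared) / len(union) if union else 0.0
    return {
        "shared": sorted(shared),
        "only_a": sorted(a - b),
        "only_b": sorted(b - a),
        "jaccard": jaccard,
    }


# Placeholder gene labels, not results from the manuscript.
report = overlap_report(["g1", "g2", "g3"], ["g2", "g3", "g4"])
```

The same comparison applies between levels of the analysis, for instance between the genes enriched in the hypothalamus and those enriched in the ARC.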

  3. Reviewer #1:

    The authors studied the over-representation of imprinted genes in the mouse brain by using fifteen single-cell RNA sequencing datasets. The analysis was performed at three levels 1) whole-tissue level, 2) brain-region level, and 3) region-specific cell subpopulation level. Based on the over-representation and gene-enrichment analyses, they interpreted hypothalamic neuroendocrine populations and monoaminergic hindbrain neurons as specific hotspots of imprinted gene expression in the brain.

    Objective:

    Though the study is potentially interesting, the expression of imprinted genes in the brain and hypothalamus is already known (Davies W et al., 2005, Shing O et al. 2019, Gregg et al, 2010, and many other studies cited in the paper). However, the authors put forth two objectives. The first is whether imprinted gene expression is actually enriched in the brain compared to other adult tissues; here they did find the brain to be one of the tissues with over-represented imprinted genes. The second is whether the imprinted genes are enriched in specific brain regions. The study objectives cannot qualify as completely novel, as the work is a validation, using scRNA-seq datasets, of most of what is already known.

    Methods and Results

    Pros:

    -15 scRNA-seq datasets were analysed independently and they were processed as in the original publication.

    -Two enrichment methods used to find tissue-specific enrichment of imprinted genes and appropriate statistics applied wherever necessary.

    Concerns:

    -It is not clear how the over-representation using Fisher's exact test was calculated. It would be appropriate to include the name of the software or R package, if used, in the basic workflow section of Materials and methods.

    -Why did authors particularly use Liger in R for GSEA analysis?

    -The GSEA plots generated using Liger and shown for each analysis in the paper are not informative on their own. For example, in figure 4 and the other GSEA plots in the paper: i) Which 'score' does the y-axis represent? Include an x-axis label and state the corrected GSEA q value either in the legend or in the figure. ii) Was the normalized enrichment score (NES) calculated? Which genes in the cluster represent maximum enrichment? A heat map of the imprinted genes contributing to the cell cluster would add clarity to the GSEA plots.

    -Apart from the tissue-specific enrichment of gene sets, a functional GO/pathways enrichment of the group of imprinted genes will strengthen the connection of these genes with feeding, parental behavior and sleep.

    -Are these imprinted genes coexpressed across the analyzed brain structures, given that the authors repeatedly stress the functioning of imprinted genes as a group?

    -A basic workflow schematic might be necessary for an easy and quick understanding of the methods.
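For reference, the over-representation test queried in the first concern is commonly computed as a one-sided Fisher's exact test, which is equivalent to the upper tail of a hypergeometric distribution. The sketch below shows that calculation using only the Python standard library; it is not necessarily how the authors implemented it, and the function and parameter names are assumptions made for illustration.

```python
from math import comb


def overrepresentation_p(k, n, K, N):
    """One-sided Fisher's exact test p-value, P(X >= k).

    N: genes in the background (all genes passing the filter)
    K: imprinted genes in the background
    n: genes called upregulated in the tissue or cell type
    k: imprinted genes among the upregulated genes
    """
    upper = min(n, K)
    total = comb(N, n)
    # Sum the hypergeometric probabilities for k, k + 1, ..., upper.
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total


# Toy numbers chosen only to make the arithmetic easy to check by hand.
p = overrepresentation_p(k=2, n=3, K=4, N=10)
```

With these toy numbers the tail is (6 × 6 + 4 × 1) / 120 = 1/3. In practice the p-values across tissues or cell types would also need a multiple-testing correction.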

    Overall, the study gives some insight into the brain regions, and particularly the cell clusters in the brain, where imprinted genes could be enriched. However, the study is preliminary in nature and largely validates previous studies. The authors have already highlighted some of the limitations of the study in the discussion.
