Induction of C₄ genes evolved through changes in cis allowing integration into ancestral C₃ gene regulatory networks

Abstract

C₄ photosynthesis has evolved independently in over sixty lineages and in so doing repurposed existing enzymes to drive a carbon pump that limits the RuBisCO oxygenation reaction. In all cases, gene expression is modified such that C₄ proteins accumulate to levels matching those of the photosynthetic apparatus. To better understand this rewiring of gene expression we undertook RNA- and DNaseI-SEQ on de-etiolating seedlings of C₄ Gynandropsis gynandra, which is sister to C₃ Arabidopsis. Changes in chloroplast ultrastructure and C₄ gene expression were coordinated and rapid. C₃ photosynthesis and C₄ genes showed similar induction patterns, but C₄ genes from G. gynandra were more strongly induced than orthologs from Arabidopsis. A gene regulatory network predicted transcription factors operating at the top of the de-etiolation network, including those responding to light, act upstream of C₄ genes. Light responsive elements, especially G-, E- and GT-boxes were over-represented in accessible chromatin around C₄ genes. Moreover, in vivo binding of many G-, E- and GT-boxes was detected. Overall, the data support a model in which rapid and robust C₄ gene expression following light exposure is generated through modifications in cis to allow integration into high-level transcriptional networks including those underpinned by conserved light responsive elements.

This Zenodo record is a permanently preserved version of a PREreview. You can view the complete PREreview at https://prereview.org/reviews/4683939.

The manuscript "Induction of C4 genes evolved through changes in cis allowing integration into ancestral C3 gene regulatory networks" by Singh and colleagues represents an important contribution to the field of comparative studies of C3 and C4 photosynthesis. The paper analyses the de-etiolation process in Gynandropsis gynandra, using time-course RNA-seq and DNaseI-seq as a proxy to capture photosynthesis genes whose expression is affected. The work takes advantage of the C4 species G. gynandra to make a comparative analysis with the well-established model Arabidopsis thaliana, for which plenty of resources are available. The two species belong to sister lineages, and this pairing has already demonstrated its versatility in the past. The analysis also focuses on transcription factor genes and their potential regulatory networks associated with genes participating in C3 and C4 photosynthesis.

This work provides a transcriptional framework to study de-etiolation in G. gynandra, together with genomic and epigenomic insights into the gene regulatory networks underlying this process. The main finding is that the cistrome of C4 genes in G. gynandra has likely diverged from that of their orthologs in Arabidopsis. The data generated in this work will also benefit the community working on C3/C4 photosynthesis beyond the scope of this analysis.

The idea that genes associated with C4 photosynthesis required changes in cis-regulatory elements to be incorporated into different gene regulatory networks during the evolution of C4 photosynthesis in certain species is intuitive, as the authors mention in the discussion, and the authors work towards this hypothesis. In general, the techniques used in Singh et al. are adequate and provide relevant evidence that supports the hypothesis, but the phenomenon has not yet been functionally demonstrated. Also, G. gynandra provides only one version of a history that was repeated multiple times during flowering plant evolution, so the results should be taken as they are and generalizations avoided. There are other hypotheses on how transcriptional networks could be rewired to generate novel traits that are not explored in this work. For example, the authors assume throughout that TF binding preferences are the same in G. gynandra and A. thaliana. Other molecular mechanisms involved in TF evolution could also operate: TF binding affinity can change, or new protein-protein interactions (e.g. with TF partners or chromatin remodelers) could be involved. Also, the analysis focuses mostly on photosynthesis "target" genes rather than taking a more holistic approach.

There are experiments that could demonstrate this in planta, using Arabidopsis for example. However, I understand that such approaches (promoter exchange and CRISPR manipulation, for instance) may be far from trivial, as well as very laborious and expensive, and that further experimentation could fall outside the scope of the manuscript. In any case, this paper provides interesting insights that justify continuing to work in that direction. In that sense, I would recommend that the authors tone down some claims, starting with the title.

There are a few aspects that could help to improve the manuscript that I would like to highlight:

1) The introduction needs more background on why G. gynandra was used. Even references from the authors' own group are difficult to find. In particular, it would be important to make clear, from the literature or from new analysis, that both species possess a similar set of orthologous genes required for C3 and C4 photosynthesis. The effect of gene gain and loss is not trivial for this analysis. Is Arabidopsis suitable for this analysis? Addressing these aspects would help to sharpen the focus of the paper.

2) In the introduction: "Further analysis using an analogous dataset from Arabidopsis allowed us to compare the extent to which regulatory mechanisms are shared between the ancestral C3 and derived C4 systems". The wording of this sentence could lead unaware readers to think that C3 plants like Arabidopsis are representative of the ancestral C3 state. Both species are extant and equally derived from the ancestral state. It is only possible to "infer" the ancestral state from comparative analysis.

3) Regarding Figure 1, are the time points used for G. gynandra developmentally equivalent to those of A. thaliana at a similar time point? Does it take a similar time to establish a fully functional photosynthetic apparatus in Arabidopsis? I think these aspects should at least be discussed in the text and eventually evaluated experimentally.

4) The names of clusters in Figure 2C are incorrect, with cluster 1 comprising two different clusters. Something similar occurs in Supplementary Figure 4. The way the rows are grouped could be arbitrary and it would not change the main message of the figure. But, if clusters are called to form the groups, the wording should be corrected.

5) The colour codes could be more reader-friendly. For instance, in Figure 2c and d the colours differ but refer to the same thing. If I understood correctly, the colour code of Figure 4h has the opposite meaning to that of Figure 2. Should highly ranked motifs be yellow?

6) Regarding the comparative analysis with A. thaliana in Figure 2, it is important to make clear in the text that not all genes have a conserved expression pattern during de-etiolation. In fact, this idea was not formally tested. So there are certainly genes that show a conserved expression pattern and genes that probably do not. Genes from both species are expected to show some degree of similarity in expression pattern, and in that sense this serves as a kind of control that the two datasets are comparable. However, for the purpose of this work, the cases where the expression pattern differs are even more interesting.

7) Figures 4e-f are fairly difficult to interpret. For example, the logic of this statement is not very clear: "As the number of any motif can vary between species due to phylogenetic distance, we ranked motif enrichment in each species." Using a non-parametric comparison between A. thaliana and G. gynandra seems reasonable, but not for the reason stated. Is the number or density of DHS peaks comparable between the two genomes? Is there any normalization for DHS peak width?

8) Also, on some axes only "the fifty most enriched motifs" were considered, but in other cases all motifs were taken into account for the ranking. What is the reason for this? It seems an odd decision and makes Figure 4e difficult to appreciate. I understand that some sort of linearity is expected, but this is not well reflected in the figure. For example, the comparison between the cistromes of G. gynandra C3 and C4 genes is described as "less similar", but visually it seems to follow a linear relationship. Which one is the "more similar"? I would encourage the authors to find a different way to visualize the comparisons so that the take-home message of the figure is more transparent to readers. The comparisons mentioned in the text could also be formally tested using non-parametric tests.

9) At some points the analysis is too general and fails to capture candidate genes that could be studied in more detail. Given that the most important claims of the paper concern evolution, the paper loses sight of the comparative analysis with Arabidopsis in places. For example, are the TCP genes that are induced in early stages of de-etiolation also induced in Arabidopsis?

10) In that sense, it would be informative to compare the distribution of DHS across features with what was found in A. thaliana.

11) In the same sense, Figure 6 is not very informative. The presence of motifs outside DHS or DGFs is not relevant. It would be more interesting to make a comparison with the orthologous genes in A. thaliana. Did C4 genes gain motifs in G. gynandra? Do these motifs exist in Arabidopsis but in less open chromatin (e.g. the TF does not bind)? At this point in the manuscript, questions of this kind could help to evaluate the original hypothesis more deeply.

12) Explanations offered in the discussion section are really good and precise. It would not be hard to tone down claims across the text to align them with the discussion.

13) Finally, there are a few typos across the text that the authors could correct.
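
To illustrate the suggestion in points 7 and 8 that rank-based comparisons could be formally tested, here is a minimal sketch in Python of one possible approach: Spearman's rank correlation between motif enrichment scores in two gene sets, with a permutation p-value. This is not the authors' method. The function names (`average_ranks`, `spearman`, `permutation_pvalue`) are hypothetical helpers written for this example, and the enrichment scores are made-up numbers, not data from the manuscript.

```python
import random


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def spearman(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def permutation_pvalue(x, y, n_perm=2000, seed=0):
    """Two-sided permutation p-value for Spearman's rho."""
    rng = random.Random(seed)
    observed = abs(spearman(x, y))
    shuffled = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if abs(spearman(x, shuffled)) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_perm + 1)


# Hypothetical enrichment scores for the same eight motifs in two gene sets.
scores_set_a = [5.1, 4.8, 4.0, 3.2, 2.9, 2.1, 1.5, 0.7]
scores_set_b = [4.2, 4.9, 3.1, 3.3, 2.0, 2.4, 0.9, 1.1]

rho = spearman(scores_set_a, scores_set_b)
p_value = permutation_pvalue(scores_set_a, scores_set_b)
print(f"Spearman rho = {rho:.3f}, permutation p = {p_value:.4f}")
```

Because it works on ranks, the same motif set must be scored in both gene sets being compared. Restricting one axis to the fifty most enriched motifs while ranking all motifs on the other would change what the correlation measures.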
