Epigenetic machinery is functionally conserved in cephalopods

Abstract

Background

Epigenetic regulatory mechanisms are divergent across the animal kingdom, yet these mechanisms are not well studied in non-model organisms. Unique features of cephalopods make them attractive for investigating behavioral, sensory, developmental, and regenerative processes, and recent studies have elucidated novel features of genome organization and gene and transposon regulation in these animals. However, it is not known how epigenetics regulates these interesting cephalopod features. We combined bioinformatic and molecular analysis of Octopus bimaculoides to investigate the presence and pattern of DNA methylation and examined the presence of DNA methylation and three histone post-translational modifications across tissues of three cephalopod species.

Results

We report a dynamic expression profile of the genes encoding conserved epigenetic regulators, including DNA methylation maintenance factors in octopus tissues. Levels of 5-methyl-cytosine in multiple tissues of octopus, squid, and bobtail squid were lower compared to vertebrates. Whole genome bisulfite sequencing of two regions of the brain and reduced representation bisulfite sequencing from a hatchling of O. bimaculoides revealed that less than 10% of CpGs are methylated in all samples, with a distinct pattern of 5-methyl-cytosine genome distribution characterized by enrichment in the bodies of a subset of 14,000 genes and absence from transposons. Hypermethylated genes have distinct functions and, strikingly, many showed similar expression levels across tissues while hypomethylated genes were silenced or expressed at low levels. Histone marks H3K27me3, H3K9me3, and H3K4me3 were detected at different levels across tissues of all species.

Conclusions

Our results show that the DNA methylation and histone modification epigenetic machinery is conserved in cephalopods, and that, in octopus, 5-methyl-cytosine does not decorate transposable elements, but is enriched on the gene bodies of highly expressed genes and could cooperate with the histone code to regulate tissue-specific gene expression.

Article activity feed

  1. Note: This rebuttal was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.

    Learn more at Review Commons


    Reply to the reviewers

    1. General Statements [optional]

    We are grateful to the reviewers for their honest opinion regarding this work and plan to address the majority of the comments in a revised version, either through new analysis or revision to the text, as we believe these will improve the manuscript by making some of the details clearer. Few of the suggestions will lead to substantive changes to the findings. Here, we address the most salient critiques, the primary one being related to novelty.

    We respectfully disagree, as our detailed analysis of the DNA methylome in Octopus bimaculoides represents a significant advance in understanding how the epigenome is patterned in non-model invertebrates in general, and cephalopods in particular. We acknowledge the previous report that the octopus methylome resembles that of the few other invertebrates where low DNA methylation has been found; however, that finding was part of a multi-organism study last year (de Mendoza et al., 2021), which lacked any detailed investigation. Our study provides the first in-depth analysis of methylation patterning and its relationship with transposons and gene expression, and reports the finding of other key epigenetic marks in O. bimaculoides and in other cephalopods.

    In short, we believe our study to be highly novel and that it represents the first analysis of this kind in cephalopods and one of the few existing in non-model invertebrate organisms. In addition, we identify the conservation of the histone code in cephalopods. While this may be expected, this is the first experimental evidence in this class and represents an important step forward in understanding the epigenetic regulation of genes and transposons in invertebrates. Finally, we plan to provide an updated transcriptome annotation for O. bimaculoides that will be available to the scientific community as a valuable new resource. We believe these features will make this study highly cited.

    We believe that findings like ours will complement several recent studies that extend the epigenetics field out of the current narrow focus on model organisms to understand how epigenetic mechanisms function in diverse animals. This provides new insights regarding the epigenetic mechanism of gene regulation in an emerging invertebrate model.

    2. Description of the planned revisions

    Reviewer 1 raised the following points that we are planning to address:

    *- It is unclear why the authors did not use the original gene models of O. bimaculoides or tried to improve them. By only relying on adult tissue (but the relatively late hatchling stage), they would have omitted most developmentally expressed genes, that are incidentally also the ones that are subjected to extensive spatiotemporal gene regulation (which is also a problem to assess the role of methylation). I think more comparisons with existing gene models and how the newly generated stringtie models should be provided. *

    We agree that using as many tissues and developmental stages as possible will expand the octopus transcriptome.

    We plan to:

    • Add RNA-seq data from stage 15 embryos to improve this.
    • Compare the gene model used in the original version of the manuscript (Stringtie model to use in Trinotate for improving the annotation of the genes) to the existing annotation model and report on which has superior performance for annotating the *O. bimaculoides* transcriptome.
    • Extend the annotation of the transcriptome which we undertook in a focused fashion in the first iteration of this manuscript.

    Reviewer 2 raised the following points that we are planning to address:

    *- It is not exactly clear to me why the authors look for expression clusters in the first part of the manuscript? This information, while interesting, does not seem to be used in the methylation analysis. It is also somewhat contradictory because the authors first claim that, based on their GO-term enrichment analysis, that different expression clusters are associated with "complex regulatory mechanisms, potentially based in the epigenome". Yet at the end they conclude that, due to the global and tissue-overarching nature of methylation, this "argues against this epigenetic modification as a player in the dynamic regulation of gene expression". *

    We thank the reviewer for pointing out this issue and we plan to clarify the point through changes to the text and additional analysis. Since we found that the methylation pattern was stable across tissues, and that it corresponded to gene expression levels regardless of tissue, we concluded that the methylation pattern is not likely relevant for the tissue-specific gene expression pattern reported in Figure 1.

    We plan to:

    • Ask whether there is a correlation between the gene clusters generated in Figure 1 and the DNA methylation patterns identified in Figure 4.

    *- At least for the trees that are shown in the main figures it would be great to show support values. *

    We thank the reviewer for this request.

    We plan to:

    • Add full Supplementary information regarding the support values in Supplemental Files for all the trees present in the main Figures.

    Reviewer 3 raised the following points that we are planning to address:

    *- It would be great to see more data on cephalopod TET and MBD structure. For example, it would be interesting to know whether octopus TETs have a CxxC domain or whether MBD proteins harbor functional 5mC - binding domains. *

    We agree that it would be of interest to examine the conservation of TET genes to expand upon the initial analysis by Planques et al. 2021 showing that O. bimaculoides has one TET homolog, one MBD4 homolog and one MBD1/2/3 homolog. Detailed analysis of the MBD4 protein has already been performed in de Mendoza et al. 2021 using the protein sequence of O. vulgaris, as the MBD4 gene in the O. bimaculoides genome appears truncated.

    We plan to:

    • Add the PFAM domain analysis for TET proteins. This will be added as a new figure panel.
    • Update the text to include the reference to the identification of MBD4/MECP2 as the invertebrate homologs of vertebrate MBD4.

    *- Even though RRBS provides limited insight into DNA methylation patterns, the authors could have done more to explore read-level 5mC information. For example, by studying single reads, the authors could deduce the numbers of fully methylated, unmethylated or partially methylated reads. Such analyses might provide valuable insight into potentially different modes of epigenetic inheritance in different tissues i.e are there tissues that favor fully methylated or unmethylated stretches of DNA vs tissues that favor partial methylation? *

    We think this is a really interesting point. This has been partially addressed in a previous work (de Mendoza et al., 2021) which found limited to no partially methylated reads in whole-genome bisulfite sequencing from O. bimaculoides brain.
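The read-level classification the reviewer describes can be sketched as follows. This is a minimal illustration and not the analysis used in the manuscript or in de Mendoza et al. (2021). It assumes each read has already been reduced to a per-read call string in which `Z` marks a methylated CpG and `z` an unmethylated CpG (the convention of Bismark methylation call strings); the `min_cpgs` threshold and the example reads are hypothetical.

```python
from collections import Counter


def classify_read(calls, min_cpgs=3):
    """Classify one read from its CpG calls ('Z' = methylated, 'z' = unmethylated)."""
    meth = calls.count("Z")
    unmeth = calls.count("z")
    # Reads with too few CpGs cannot be called reliably (threshold is an assumption).
    if meth + unmeth < min_cpgs:
        return "too_few_cpgs"
    if unmeth == 0:
        return "fully_methylated"
    if meth == 0:
        return "unmethylated"
    return "partially_methylated"


def summarize_reads(call_strings, min_cpgs=3):
    """Count how many reads fall into each methylation class."""
    return Counter(classify_read(c, min_cpgs) for c in call_strings)


# Hypothetical call strings; '.' stands for any non-CpG position.
reads = ["Z..Z.Z", "z.z..z", "Z.z.Z.", "..Z...", "zzzz"]
summary = summarize_reads(reads)
```

Comparing the resulting class counts between tissues would show whether any tissue favors partially methylated reads over fully methylated or unmethylated ones.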

    We plan to:

    3. Description of the revisions that have already been incorporated in the transferred manuscript

    Reviewer 1 raised the following points that we have already addressed:

    We addressed all the comments raised by this Reviewer by revising the text, fixing references, typos and improving clarity.

    Reviewer 2 raised the following points that we have already addressed:

    We addressed all the minor comments raised by this Reviewer regarding spelling errors and Supplementary Figures.

    - The finding that less than 10% of all possible sites are methylated is surprising. I could not (easily) find statistics of RRBS experiment read mapping to the genome.

    We have now provided these data in a new Supplemental Table 1 (referred to in the text as Table S1).

    *- It is very exciting to see methylation of gene bodies and some correlation to their expression levels, but the authors may need to include a disclaimer that the methylation of TEs may go undetected due to the gapness of the genome. In fact, the authors may try to map their data onto a somewhat closely related Octopus sinensis genome sequenced with long reads available at NCBI to confirm overall pattern. It is likely though that due the evolutionary distance only gene bodies will have mapping. *

    We thank the reviewer for this suggestion and have included a sentence in the Results section indicating that methylation of TEs may go undetected due to the poor annotation of the octopus genome.

    *- The statistical reasoning (and methodology) behind how clusters in Figures 1 and 4 were defined is unclear. In particular, in Figure 4, it seems that the authors had asked the program to give four clusters in total - why was this number chosen? It seems that using the same generic clustering approach as in Figure 1 may benefit or confirm the results in Figure 4. *

    We clarified the rationale in the Materials and Methods section describing the bioinformatic analysis. We will put the full code used in the manuscript on our GitHub page (https://github.com/SadlerEdepli-NYUAD/) to give a more comprehensive understanding of the methods used.

    Reviewer 3 raised the following points that we have already addressed:

    We addressed all the minor comments in the text and figures raised by this reviewer regarding typos and clarity.

    *- There is little info on the generated 5mC data. To bolster its value as a resource, the manuscript should have a link to the table describing RRBS metrics. This should include: non-conversion rates, numbers of sequenced and mapped reads, read length and other info that the authors deem useful. *

    We have now provided these data in a new Supplemental Table 1 (referred to in the text as Table S1).

    4. Description of analyses that authors prefer not to carry out

    Reviewer 1 raised the following points that we are not planning to address:

    *- The newly sequenced RNA-seq samples are using a ribodepletion protocol (RiboZero) while the other ones are using a polyA selection. This might be a slight problem to compare them quantitatively. Actually in the Figure 1, all 4 newly generated samples group together in the hierarchical clustering. *

    We acknowledge the reviewer’s point here and agree that heterogeneity in library prep and batch is a common issue when comparing publicly available with newly generated datasets. This could account for the separate clustering of the ribosomal RNA-depleted (i.e. RiboZero) libraries from the polyA-selected RNA libraries. While this could potentially introduce bias, we do not believe that it substantially alters any of the main findings or the interpretations of these data. Our purpose for carrying out the cluster analysis of transcriptomic data from multiple tissues was to identify distinct gene patterns that defined different tissue types. This was accomplished regardless of the potential confounding variable introduced by different library preparations. In addition, we used TPM, which seems to help in the comparison across different samples when used for qualitative analysis such as PCA and cluster analysis (Zhao et al. 2020; DOI: http://www.rnajournal.org/cgi/doi/10.1261/rna.074922.120). Therefore, even if not ideal, we think that this approach is still valuable.

    *- I am not so sure about the way the authors used z-score normalized logTPMs and applied hierarchical clusters, this most likely would not fully alleviate the impact of expression level on the outcome compared to more advanced form of normalization and clustering. *

    We agree with the reviewer that applying z-score or logTPM normalization would not fully resolve the technical variance in the direct comparison of libraries generated with different RNA selection methods. We did not apply z-scores to logTPMs; rather, these two methods were applied separately: z-score on TPMs in Figure 1B to define the gene clusters, and log2(TPM+1) in Figure 4E. We have clarified the text to reflect this.
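The two transformations named above can be illustrated separately with a small sketch. This is not the authors' pipeline: the gene names and TPM values are hypothetical, and the use of the population standard deviation in the z-score is an assumption.

```python
import math
from statistics import mean, pstdev


def zscore_rows(tpm_by_gene):
    """Z-score each gene's TPM values across samples (one row per gene)."""
    scaled = {}
    for gene, values in tpm_by_gene.items():
        mu = mean(values)
        sd = pstdev(values)  # population standard deviation (an assumption)
        # A gene with identical TPM in every sample has no variance; report zeros.
        scaled[gene] = [0.0 if sd == 0 else (v - mu) / sd for v in values]
    return scaled


def log2_tpm_plus_one(tpm_by_gene):
    """Apply log2(TPM + 1) to every value; the pseudocount keeps zeros finite."""
    return {g: [math.log2(v + 1) for v in vals] for g, vals in tpm_by_gene.items()}


# Hypothetical TPM values for two genes across three tissues.
tpm = {"geneA": [0.0, 3.0, 7.0], "geneB": [100.0, 100.0, 100.0]}
z = zscore_rows(tpm)              # relative pattern across tissues, per gene
logged = log2_tpm_plus_one(tpm)   # compressed absolute expression level
```

The z-score removes each gene's absolute expression level and keeps only its pattern across tissues, which suits clustering by tissue profile, whereas log2(TPM+1) preserves differences in absolute level between genes.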

    *- I am not convinced that differences in western blot for histone modification could really provide a clear insight into their regulatory role. *

    We agree with the reviewer that Western blotting for histone modifications does not provide deep insight into their regulatory role. However, this is the first description of these marks in any cephalopod, and we believe that reporting a finding from experimental evidence is important, even if the result is aligned with the existing paradigm. Moreover, the marked difference in levels of distinct histone marks across tissues supports the hypothesis that they play a regulatory role. We observed this in mice, where differences in abundance by western blot correspond to differences in abundance and enrichment by ChIP-seq (Zhang et al., 2021 DOI: https://doi.org/10.1038/s41467-021-24466-1). Considering the limited tools available in this species, we still consider this an important finding.

    Reviewer 2 raised the following points that we are not planning to address:

    *- The finding that less than 10% of all possible sites are methylated is surprising. I could not (easily) find statistics of RRBS experiment read mapping to the genome. I also wonder how much the gap-richness of the genome may affect the overall methylation estimate. If assembly permits, would it make sense to limit the sampled sites to areas where no flanking gaps are present (and sufficient scaffold length is available, maybe excluding very short scaffolds)? *

    We added all the statistical values regarding the RRBS in a new Supplemental Table 1. We used a single base pair analysis approach (not tiling windows), so the data we extracted are not biased by the length of the scaffolds. This is confirmed by the fact that the DNA methylation value obtained in our RRBS data matches the findings observed in Whole Genome Bisulfite Sequencing (WGBS). Moreover, global DNA methylation values assessed by slot blot analysis, a technique independent of genome assembly, confirmed what was observed with RRBS.
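A single-base summary of the kind described above can be sketched as follows. Each covered CpG contributes one observation, so scaffold length does not enter the calculation. This is an illustrative sketch only: the input tuple layout, the `min_coverage` and `methylated_cutoff` values, and the example sites are assumptions, not values taken from the manuscript.

```python
def cpg_methylation_summary(sites, min_coverage=10, methylated_cutoff=0.8):
    """Summarize per-CpG methylation.

    sites: iterable of (scaffold, position, methylated_reads, unmethylated_reads).
    Returns (covered_cpgs, methylated_cpgs, fraction_methylated).
    """
    covered = 0
    methylated = 0
    for _scaffold, _position, meth, unmeth in sites:
        depth = meth + unmeth
        # Skip CpGs with too few reads to estimate a methylation level.
        if depth < min_coverage:
            continue
        covered += 1
        if meth / depth >= methylated_cutoff:
            methylated += 1
    fraction = methylated / covered if covered else float("nan")
    return covered, methylated, fraction


# Hypothetical CpG sites on scaffolds of unspecified length.
sites = [
    ("scaffold_1", 101, 18, 2),  # 90% methylated
    ("scaffold_1", 250, 1, 19),  # 5% methylated
    ("scaffold_2", 40, 0, 12),   # unmethylated
    ("scaffold_2", 77, 3, 2),    # depth 5, below the coverage threshold
    ("scaffold_3", 9, 0, 30),    # unmethylated
]
covered, methylated, fraction = cpg_methylation_summary(sites)
```

Restricting `sites` to scaffolds above a chosen length before calling the function would give the gap-aware comparison the reviewer suggests.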

  2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


    Referee #3

    Evidence, reproducibility and clarity

    The manuscript by Macchi et al describes the epigenome and the transcriptome of Octopus bimaculoides. While the manuscript itself is well written and the data are properly analyzed, it is fair to say that the work itself offers little biological novelty. Nevertheless, I still believe that the datasets and some of the analyses could be useful to researchers studying invertebrate epigenomes and gene regulation.

    1. It would be great to see more data on cephalopod TET and MBD structure. For example, it would be interesting to know whether octopus TETs have a CxxC domain or whether MBD proteins harbor functional 5mC - binding domains.
    2. There is little info on the generated 5mC data. To bolster its value as a resource, the manuscript should have a link to the table describing RRBS metrics. This should include: non-conversion rates, numbers of sequenced and mapped reads, read length and other info that the authors deem useful.
    3. Even though RRBS provides limited insight into DNA methylation patterns, the authors could have done more to explore read-level 5mC information. For example, by studying single reads, the authors could deduce the numbers of fully methylated, unmethylated or partially methylated reads. Such analyses might provide valuable insight into potentially different modes of epigenetic inheritance in different tissues i.e are there tissues that favor fully methylated or unmethylated stretches of DNA vs tissues that favor partial methylation?

    Minor comments:

    There are a few spelling errors throughout the manuscript. Please check for those: Figure 4F ("Trascrips" instead of transcripts), Schmedtea instead of Schmidtea. There are likely other errors as well.

    Page 3 - "intergenome" sounds a bit weird.

    The authors might consider citing Planques et al, 2021 (BMC Biol) alongside Mendoza et al when discussing unusually high 5mC levels in the sponge.

    Significance

    The main points of the paper are: i) a somewhat improved transcriptome, ii) DNA methylation data generated by RRBS that follows a canonical invertebrate pattern (low 5mCG levels present in GBs and absent from repeats), and iii) evolutionary analyses of epigenetic machinery components. While lacking biological novelty, the presented data have a resource value and could likely serve as a decent starting point for further exploration of cephalopod gene regulation. I therefore believe that with some revision the manuscript will merit publication in one of the Review Commons - associated journals.

  3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


    Referee #2

    Evidence, reproducibility and clarity

    The paper by Macchi et al studies DNA methylation patterns in Octopus bimaculoides, describing overall conservation of DNA methylation machinery and genome-wide methylation patterns and their effect on gene expression across broad tissue sampling. As such, the paper comprises a key advancement in the emerging field of cephalopod (epi)genomics and gene regulation. Despite the difficulties relating to the genome assembly of O. bimaculoides, the authors have done a solid analysis of methylation patterns and the results look generally sound. I have a few points that may help the authors improve their manuscript:

    • The finding that less than 10% of all possible sites are methylated is surprising. I could not (easily) find statistics of RRBS experiment read mapping to the genome. I also wonder how much the gap-richness of the genome may affect the overall methylation estimate. If assembly permits, would it make sense to limit the sampled sites to areas where no flanking gaps are present (and sufficient scaffold length is available, maybe excluding very short scaffolds)?
    • It is not exactly clear to me why the authors look for expression clusters in the first part of the manuscript? This information, while interesting, does not seem to be used in the methylation analysis. It is also somewhat contradictory because the authors first claim that, based on their GO-term enrichment analysis, that different expression clusters are associated with "complex regulatory mechanisms, potentially based in the epigenome". Yet at the end they conclude that, due to the global and tissue-overarching nature of methylation, this "argues against this epigenetic modification as a player in the dynamic regulation of gene expression".
    • It is very exciting to see methylation of gene bodies and some correlation to their expression levels, but the authors may need to include a disclaimer that the methylation of TEs may go undetected due to the gapness of the genome. In fact, the authors may try to map their data onto a somewhat closely related Octopus sinensis genome sequenced with long reads available at NCBI to confirm overall pattern. It is likely though that due the evolutionary distance only gene bodies will have mapping.
    • At least for the trees that are shown in the main figures it would be great to show support values.
    • The statistical reasoning (and methodology) behind how clusters in Figures 1 and 4 were defined is unclear. In particular, in Figure 4, it seems that the authors had asked the program to give four clusters in total - why was this number chosen? It seems that using the same generic clustering approach as in Figure 1 may benefit or confirm the results in Figure 4.
    • In the discussion Scmidtea is misspelled.
    • Some supplementary figures have to be exported as spell checker highlights are still present (e.g., in Suppl Fig 4).

    Significance

    This manuscript is an important step towards understanding the workings of gene regulation at the epi-genomic level in octopus and cephalopods in general.

  4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


    Referee #1

    Evidence, reproducibility and clarity

    This manuscript focuses on the role of DNA methylation and histone modification in the gene regulation of cephalopods. It complements recently published RNA-seq and MethylSeq datasets with a few extra samples and generally confirms previous findings that DNA methylation does not play an active role in tissue or stage-specific regulation of gene expression in cephalopods (which is the general rule for most non-vertebrates). I don't see any methodological issue serious enough to preclude publication but some details should be strengthened.

    • The newly sequenced RNA-seq samples are using a ribodepletion protocol (RiboZero) while the other ones are using a polyA selection. This might be a slight problem to compare them quantitatively. Actually in the Figure 1, all 4 newly generated samples group together in the hierarchical clustering.
    • It is unclear why the authors did not use the original gene models of O. bimaculoides or tried to improve them. By only relying on adult tissue (but the relatively late hatchling stage), they would have omitted most developmentally expressed genes, that are incidentally also the ones that are subjected to extensive spatiotemporal gene regulation (which is also a problem to assess the role of methylation). I think more comparisons with existing gene models and how the newly generated stringtie models should be provided.
    • I am not so sure about the way the authors used z-score normalised logTPMs and applied hierarchical clusters, this most likely would not fully alleviate the impact of expression level on the outcome compared to more advanced form of normalisation and clustering.
    • I am not convinced that differences in western blot for histone modification could really provide a clear insight into their regulatory role.

    Significance

    This manuscript reports confirmatory results, partly reanalysing and confirming previous work. I would also like to stress that the methylation results have already been reported and discussed in a previous paper (de Mendoza et al. 2021). I don't have a fundamental problem with this but I also find the paper slightly overambitious and unspecific in its goals. I think it would benefit from being made slightly more concise. I find the part on histone marks quite overstated. These marks are quite universal in eukaryotes and generally demonstrated to play a regulatory role; the fact that they can be detected in cephalopods by western blot is therefore not really a result.

    Comments on the text (difficult without line numbers):

    • Intro, first section: it would be good to have a few more references
    • "While this has been extremely fruitful in elucidating detailed mechanisms of epigenome patterning, regulation and function, they do not provide a comprehensive understanding of the multiple and varied ways that the epigenome functions." -> sentence is quite confusing and without very clear meaning
    • "In contrast, the most common invertebrate model organisms - Caenorhabditis elegans and Drosophila melanogaster - lack DNA methylation entirely. " -> could sound like this is the case for more invertebrates.
    • P4 "This is the case in many animals, " -> give examples, it is unclear which examples of TE control by methylation outside vertebrates have been corroborated by data. The papers cited do not deal with methylation in squid
    • Evolution has selected for variations in the canonical patterns of methylation -> such explanation could also be consistent with neutralistic explanations
    • p19: Schmidtea (typo) "such as the planarian Schmedtea mediterranea"