The proteotranscriptomic characterization of venom in the white seafan Eunicella singularis elucidates the evolution of Octocorallia arsenal

Abstract

All members of the phylum Cnidaria are characterized by the production of venom in specialized structures, the nematocysts. Venom of jellyfish (Medusozoa) and sea anemones (Anthozoa) has been investigated since the 1970s, revealing a remarkable molecular diversity. Specifically, sea anemones harbour a rich repertoire of neurotoxic peptides, some of which have been developed into drug leads. However, venoms of the vast majority of Anthozoa species remain uncharacterized, particularly in the class Octocorallia. To fill this gap, we applied a proteo-transcriptomic approach to investigate the venom composition in Eunicella singularis, a gorgonian species common in Mediterranean hard-bottom benthic communities. Our results highlighted the peculiarities of the venom of E. singularis with respect to sea anemones, which is reflected in the presence of several toxins with novel folds, worthy of functional characterization. A comparative genomic survey across the octocoral radiation allowed us to generalize these findings and provided insights into the evolutionary history, molecular diversification patterns and putative adaptive roles of venom toxins. A comparison of whole-body and nematocyst proteomes revealed the presence of different cytolytic toxins inside and outside the nematocysts. Two instances of differential maturation patterns of toxin precursors were also identified, highlighting the intricate regulatory pathways underlying toxin expression.

Article activity feed

  1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.


    Reply to the reviewers

    Manuscript number: RC-2024-02535

    Corresponding author(s): Modica, Maria Vittoria

    1. General Statements [optional]

    We are grateful to the reviewers for their detailed evaluation and insightful comments on our manuscript, which have led us to introduce several clarifications, expand on a few issues that were initially underdeveloped, and amend some inconsistencies.

    We have been able to incorporate changes reflecting most of the reviewers' suggestions, as highlighted in the main text. Most of the additional analyses proposed by the reviewers were carried out; in some cases they provided interesting insights that were included in the manuscript, while in others they proved inconclusive, as detailed below.

    We believe that the consistency and readability of the manuscript have been improved overall, and we are confident that our responses align with the level of detail required by the reviewers.

    2. Point-by-point description of the revisions

    Reviewer #1 (Evidence, reproducibility and clarity (Required)):

    Summary: The manuscript by Modica et al. reports the characterisation of the venom system in the white sea fan Eunicella singularis, an octocoral species common in the north-western Mediterranean Sea. The authors used a proteo-transcriptomic approach followed by extensive bioinformatics analysis. Specifically, they generated a new E. singularis transcriptome and characterised extracts from nematocysts (venom-bearing structures) and whole body using tandem mass spectrometry. Toxins were identified by HMMER using Tox-prot and VenomZone databases as queries as well as the ClanTox web server.

    Major comments:

    As far as I am aware, venom production by ectodermal gland cells has been reported only in sea anemones (Moran et al, 2011), therefore it is unclear whether it is the case in the octocorallian sea fan as well. Additionally, cnidarian toxin-like proteins might be produced by neurons (Sachkova et al, 2020) or involved in development (Surm et al 2024). Thus, it is probable that in E. singularis not all the toxin-like proteins found in the whole body proteome and missing from the nematocyst proteome are venom components. Additional experiments would be required to localise those proteins to ectodermal gland cells. I suggest mentioning this limitation and referring to such proteins as "toxin-like" or "putative toxins".

    We thank the Reviewer for this observation, which is indeed correct. We have modified the text according to this suggestion and we have added a cautionary statement to the analysis section.

    In addition to submitting proteomics data to PRIDE, it would be helpful for readers/reviewers to provide a supplementary excel file with all the peptides and proteins identified by PEAKS Studio. I could not access the data on PRIDE as I think they still have not been assigned a PXD dataset identifier.

    Excel files with both proteomes have now been provided as supplementary material (Suppl. Tabs. 2 and 3).

    Minor comments:

    It would be helpful for readers to split the Results and Discussions into smaller subsections with headings, perhaps according to the identified toxin families. It would be also helpful to provide a summary figure with all the toxins identified and perhaps toxin expression levels. Especially showing cysteine patterns for new toxins would be very useful.

    Wherever possible, Results and Discussions were split into subsections according to toxin families, following the reviewer's suggestion.

    Figure 2.C summarizes the identified toxin families along with the number of validated sequences for each of them. We provided an Excel file with the sequences and expression levels of the identified toxins as supplementary table 2. We have now added a column with cysteine patterns to better define and characterize these toxins.
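As an illustration of what a cysteine-pattern column might contain, the following minimal Python sketch derives a cysteine framework string from a peptide sequence. The `C-xN-C` notation, the function name `cysteine_pattern`, and the toy sequence are assumptions made for this example; they are not taken from the manuscript or its supplementary tables.

```python
def cysteine_pattern(seq: str) -> str:
    """Summarise the spacing between cysteines, e.g. 'C-x3-CC-x2-C'.

    Adjacent cysteines are written as 'CC'; a gap of N non-cysteine
    residues between two cysteines is written as '-xN-'.
    Returns an empty string if the sequence has no cysteine.
    """
    positions = [i for i, aa in enumerate(seq.upper()) if aa == "C"]
    if not positions:
        return ""
    pattern = "C"
    for prev, cur in zip(positions, positions[1:]):
        gap = cur - prev - 1  # residues strictly between the two cysteines
        pattern += "C" if gap == 0 else f"-x{gap}-C"
    return pattern


# Hypothetical toy peptide, not a real E. singularis toxin.
toy_peptide = "ACDEFCCGHC"
toy_pattern = cysteine_pattern(toy_peptide)
```

Residues before the first and after the last cysteine are ignored here, since only the inter-cysteine spacing defines the framework in this notation.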

    It is unclear why the Toxin annotation pipeline is hidden in the supplementary material. It would be also helpful to show it as a schematic pipeline in the main text.

    We have prepared a figure describing the annotation pipeline that is now provided as Fig.1 in the main text.

    The identification of proteolytic cleavage sites is not really described. It would be also helpful to mark them in Figure 2.

    We have adjusted the Methods section in the Supplementary Material to give a clearer explanation of the methods applied to identify putative cleavage sites. The figure (now Fig. 3) has been adjusted to include the protease recognition site.

    "Other peptides present in E. singularis nematocysts and displaying protease inhibitory domains, but likely lacking a toxin function (Kazal-type, cystatines, antistasins, and macins)..." - why do they likely lack a toxin function? What is the rationale behind this statement?

    While we were referring to a strictly neurotoxic function, the statement was indeed misleading. It has been removed from the amended text and replaced as follows: “Other peptides present in E. singularis nematocysts displaying protease inhibitory domains (Kazal-type, cystatines, antistasins, and macins) were detected but did not present novelty elements. Their sequences are described in supplementary data.”

    "cell- or tissue-specific differential maturation patterns" - I think the differential maturation needs to be confirmed by additional experiments to exclude a possibility of being an artifact due to low mass spectrometry sensitivity.

    This is indeed true. Nonetheless, our proteomic analyses provided quite convincing evidence of this phenomenon. Figure 3 in the manuscript summarizes the output of our PEAKS Studio analyses, but for clarity we now report as Suppl. Fig. 1 the original output for the identification of U-GRTX-Esi2a/b. In the figure, each blue line below the precursor sequence denotes a peptide that was confidently identified by LC-MS/MS. As visible, several peptides were identified for this protein in each proteome, but there is a clear pattern pointing toward the complete absence of the first domain in the NEM-P. The Reviewers have rightfully raised concerns that, given the ethanol extraction protocol employed, our NEM-P may be partial and/or contaminated by other extracted proteins. This is true, and in fact we have added cautionary statements throughout the text. It is reasonable to assume, though, that proteins with similar sequence and physicochemical features, like U-GRTX-Esi2a and 2b, will respond similarly to the ethanol extraction procedure. If present, we believe the first domain (U-GRTX-Esi2a) should have produced some detectable peptide also in the NEM-P. This seems even more reasonable if we consider that the WB-P contained a much higher number of proteins, which could have led to the loss of detection of some peptides due to instrument settings. With due caution, we believe it is reasonable to leave our claim in the manuscript, supporting it by adding Suppl. Fig. 1.

    "three consecutive ShK domains with peculiar characteristics (Suppl. Fig. 2)" - what are these characteristics?

    This has been clarified in the text, which now reads: “Only the C-terminal domain has the typical ShKT cysteine pattern, whereas the first two domains present an unusual shift of the C-terminal cysteine. None of the domains of U-GRTX-Esi4 presents the key Lys residue necessary for binding KV1.2 and KV1.3, while the subsequent Tyr residue, also important for binding KV1, is extremely conserved”. The reference figure is now Suppl. Fig. 3.

    Fig. S1 legend: "Octocorallia (cyano bar) and Hexacorallia (blue bar)" - the bars look pink and cyan.

    The figure (now Suppl. Fig. 2) was modified in order to fix this issue.

    Referee cross-commenting

    I agree with both reviewers that additional validation of the ethanol extraction method would be required to confirm its specificity and efficiency. Since ethanol is widely used for tissue fixation, I would guess that it is improbable that it leads to disruption of other coral cell types in addition to discharging nematocytes. However, to be 100% sure, that would need to be confirmed experimentally. I think the suggestion to use the Xenia single cell dataset to validate the nematocyst proteome reported in this paper is really worth trying. However, toxin-like genes in cnidarians might be recruited to non-venom cell types (Sachkova et al, 2020; Surm et al 2024); therefore, if a gene is nematocyte-specific in one species, it does not mean it would be the same in another one, especially if they are distantly related. Thus, the best would be to run some additional experiments in Eunicella singularis, if the tissue is available.

    We have taken this concern on board and addressed it by rephrasing the text. We have also performed the requested check against the Xenia nematocyte single-cell dataset. In detail, we recovered 243 high-confidence single-copy orthologs conserved between Xenia and E. singularis, which were described as belonging to cluster 11, associated with nematocytes by Hu and colleagues in their 2020 Nature article. We comparatively evaluated the abundance of the peptide fragments that could be mapped to the corresponding de novo assembled contigs in the E. singularis whole-body and nematocyst proteomes, finding very little overlap with either proteome. In detail, we found none of the sequences shared with Xenia cluster 11 in the NEM-P, while 16 sequences were retrieved in the WB-P. None of the latter corresponded to toxins; rather, they possessed PFAM domains indicative of housekeeping functions.

    We believe that these observations are not surprising, due to the following reasons:

    (i) as we show in Figure 6, Xenia appears to display a venom arsenal that is highly divergent not just from that of Eunicella singularis, but also from those of all other Octocorallia. Consequently, we can hardly expect any of the main molecular components of the venom to display a 1:1 orthology between the two species. In addition, Xenia is a zooxanthellate species, obtaining most of its energy autotrophically and complementing it with the absorption of particulate organic matter. Due to its trophic ecology, we do not expect this species to produce predatory venom.

    (ii) although Xenia cluster 11 includes genes specifically expressed in the nematocysts, these do not necessarily encode venom components and may instead encode other cellular components of the nematocytes. In contrast, if successful, our approach would yield a fraction enriched in secretory products, while other intracellular or membrane-bound proteins that are specifically expressed by nematocytes are not expected to be particularly enriched in the NEM-P.

    In addition, due to the remarkable divergence between these two species, not all Xenia nematocyte-specific transcripts are expected to retain the same specificity in Eunicella.
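One way the overlap counts reported above (0 of 243 orthologs in the NEM-P, 16 of 243 in the WB-P) could be put in statistical context is a hypergeometric tail probability. The sketch below is only an illustration in plain Python: the function name `hypergeom_tail` is hypothetical, and the background size and proteome sizes are placeholders, because the response does not state them.

```python
from math import comb


def hypergeom_tail(k: int, M: int, n: int, N: int) -> float:
    """P(X >= k) for X ~ Hypergeometric(M, n, N).

    M: size of the background set (e.g. all proteins that could be detected)
    n: number of 'marked' items in the background (e.g. cluster 11 orthologs)
    N: number of items drawn (e.g. proteins identified in one proteome)
    """
    upper = min(n, N)
    hits = sum(comb(n, i) * comb(M - n, N - i) for i in range(k, upper + 1))
    return hits / comb(M, N)


# Placeholder sizes: ASSUMPTIONS for illustration only, not values from the study.
background = 5000   # hypothetical number of detectable proteins
orthologs = 243     # 1:1 orthologs of Xenia cluster 11 genes (from the text)
wb_size = 1000      # hypothetical size of the whole-body proteome
nem_size = 200      # hypothetical size of the nematocyst proteome

# Enrichment-style question: probability of seeing at least 16 orthologs in the WB-P.
p_wb_at_least_16 = hypergeom_tail(16, background, orthologs, wb_size)

# Depletion-style question: probability of seeing no ortholog at all in the NEM-P,
# computed as 1 - P(X >= 1).
p_nem_zero = 1.0 - hypergeom_tail(1, background, orthologs, nem_size)
```

With real sizes in place of the placeholders, a small `p_nem_zero` would indicate that the absence of cluster 11 orthologs from the NEM-P is unlikely to be due to chance alone.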

    Reviewer #1 (Significance (Required)):

    This study reports venom composition of an octocoral for the first time. These data are very important for understanding biology and ecology of these animals as they rely on venom for feeding and deterring predators. This study is a significant advancement of the cnidarian venomics as most of the literature is limited to sea anemone and jellyfish venoms. This study will be interesting to the broad audience: venomics and coral ecology communities, evolutionary biologists and marine scientists. The main strength of this work is that it provides a comprehensive overview of the venom system in a widespread octocoral species with important ecological roles. The main limitation of this study is that the toxicity and biological function of the identified venom components have not been confirmed experimentally. However, the localisation of the proteins to nematocysts is a very strong indication of being a venom component.

    My expertise: cnidarian venom (biochemistry, ecology and evolution).

    Reviewer #2 (Evidence, reproducibility and clarity (Required)):

    Summary: The authors of this work explore the venom repertoire of octocorals, a group of cnidarians whose venom has largely been ignored in the literature. As a first step into characterizing the venom of octocorals, the authors use a proteo-transcriptomic approach for Eunicella singularis. Specifically, they generated the transcriptome and proteome from whole-body as well as a more specific proteome of the nematocyst, a specialized sub-cellular structure found only in cnidarians and used to inject venom. The nematocyst proteome is a crucial dataset of the manuscript as it allows the authors to discriminate what is most likely a bona fide toxin compared to general physiological proteins.

    Major: However, I have some skepticism regarding the legitimacy of this nematocyst proteome, specifically as to whether the proteins in it are nematocyst-specific. The authors used an approach in which the animal is soaked in ethanol, which theoretically should cause the nematocysts to fire, releasing the venom housed inside. This technique was previously used in box jellyfish, where histological approaches showed that the nematocysts had indeed fired. However, this was not validated for Eunicella singularis. I am hesitant to fully accept that the data from the nematocyst proteome are specific. Other approaches, such as isolating nematocysts using a Percoll gradient, will likely generate a more specific nematocyst proteome. This Percoll gradient approach has been used to isolate nematocysts from different species of cnidarians ranging from hydra to sea anemones; however, I recognize that although this approach is robust for different cnidarians, acquiring enough material is challenging and may be beyond the capacity for this octocoral. I would argue this would be the best approach, but if it is not feasible I can understand. However, other potential validation could be used to help improve the confidence that this is, at least mostly, nematocyst-specific. Furthermore, one could argue that the ethanol approach used in box jellyfish specifically used tentacles, a tissue significantly enriched in nematocysts, likely greatly improving the specificity in isolating nematocyst-specific proteins, whereas in this study a collection of whole polyps was used; therefore, anything that is extracted by the ethanol would precipitate. This is a much more complex collection of tissues, which I would assume could interfere with isolating nematocyst-specific proteins.

    We thank the Reviewer for these comments. It is indeed true that there are cleaner procedures to extract venom from nematocysts. Preliminary attempts with electrical stimulation of colonies to milk the venom were also performed, but did not yield satisfactory peptide amounts for further analysis. We then decided to attempt ethanol extraction. As also noted by Reviewer #1, ethanol is routinely used for tissue fixation, and we think that it could have only a limited effect on other cell types; we therefore assumed that most proteins in this extract had to come from nematocyst firing. While we cannot be sure that we fired all kinds of nematocysts from E. singularis, the enrichment of the NEM-P in proteins with typical toxin features (i.e. signal peptide, small size, elaborate cysteine patterns) provides indirect support for this hypothesis. We believe this NEM-P may represent a good snapshot of venom components from E. singularis. On the other hand, it is true that the ethanol procedure may introduce some contamination. Accordingly, we adopted a conservative approach and discussed in detail only the proteins with toxin-like features. At any rate, we have clearly stated the methodological limitations of our approach in the text and added cautionary statements throughout the manuscript.

    A computational approach, that I think is essential, is to use the Xenia single-cell atlas. Xenia is also an octocoral with a nice single-cell atlas in which the cnidocytes form a distinct cluster. The authors can perform a reciprocal best-BLAST-hit search between the Xenia genome and the Eunicella singularis transcriptome and then see if genes encoding proteins found in the Eunicella nematocyst proteome have orthologs among the genes found in the Xenia cnidocyte cluster. A statistical test could then be performed to show that there is a significant overlap between the nematocyst proteins from Eunicella and their orthologs in the Xenia cnidocyte cluster. This is still quite indirect but can give some insights. A better approach would be to perform proteomics from Xenia using the ethanol approach and mapping to see where the proteins captured are found in the atlas. This would massively elevate this work and provide proof that indeed this approach using ethanol is capable of precipitating nematocyst-specific proteins. I would strongly recommend trying to provide some evidence that this is indeed a nematocyst-specific protein, or at the least, is significantly enriched. Because this is unknown, many of the interpretations presented downstream are not well supported.

    As previously stated in response to Reviewer #1, we have performed the requested check on Xenia nematocyte single cell data set. In detail, we followed the advice provided by the reviewer, extracting the protein sequences of the 432 Xenia genes included in cluster 11 from the work by Hu and colleagues, and recovered the nucleotide sequence of the assembled transcripts of 243 high-confidence 1:1 orthologs from E. singularis. In this process, we paid particular attention to excluding ambiguous matches, such as genes subjected to lineage-specific duplications, and therefore we exploited the availability of the annotated genome of the congeneric species E. verrucosa for the first step of orthology detection (performed through a reciprocal BLASTp approach). In the second step of the analysis, the corresponding assembled transcripts from E. singularis were identified with tBLASTn, assuming an inter-specific divergence This subset of putative nematocyst-specific sequences was subjected to an in-depth analysis, which comparatively evaluated the relative abundance of mapped peptide fragments in the whole-body and nematocyst proteomes. This process led to the identification of very little overlap between Xenia and E. singularis. We believe that these observations are not surprising, due to the following reasons:

    (i) as we show in Figure 6, Xenia appears to display a venom arsenal that is highly divergent not just from that of Eunicella singularis, but also from those of all other Octocorallia. Consequently, we can hardly expect any of the main molecular components of the venom to display a 1:1 orthology between the two species. In addition, Xenia is a zooxanthellate species, obtaining most of its energy autotrophically and complementing it with the absorption of particulate organic matter. Due to its trophic ecology, we do not expect this species to produce predatory venom.

    (ii) although Xenia cluster 11 includes genes specifically expressed in the nematocysts, these do not necessarily encode venom components and may instead encode other cellular components of the nematocytes. In contrast, if successful, our approach would yield a fraction enriched in secretory products, while other intracellular or membrane-bound proteins that are specifically expressed by nematocytes are not expected to be particularly enriched in the NEM-P.

    In addition, due to the remarkable divergence between these two species, not all Xenia nematocyte-specific transcripts are expected to retain the same specificity in Eunicella.
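The reciprocal-BLAST step of the orthology detection discussed in this exchange can be sketched as follows. This is a minimal Python illustration, not the authors' pipeline: it assumes standard 12-column tabular BLAST output (query ID in column 1, subject ID in column 2, bit score in column 12), the function names and sequence identifiers are hypothetical, and it does not reproduce the additional filtering of lineage-specific duplications or the subsequent tBLASTn step mentioned in the response.

```python
def best_hits(lines):
    """Map each query ID to its highest-scoring subject ID.

    `lines` is an iterable of tab-separated BLAST rows in the standard
    12-column tabular layout; the bit score is read from the last column.
    """
    best = {}
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment headers
        fields = line.rstrip("\n").split("\t")
        query, subject, bitscore = fields[0], fields[1], float(fields[11])
        if query not in best or bitscore > best[query][1]:
            best[query] = (subject, bitscore)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(forward_lines, reverse_lines):
    """Return sorted (a, b) pairs where b is a's best hit and a is b's best hit."""
    forward = best_hits(forward_lines)
    reverse = best_hits(reverse_lines)
    return sorted((a, b) for a, b in forward.items() if reverse.get(b) == a)


def row(query, subject, bitscore):
    """Build a toy 12-column BLAST row; only IDs and bit score matter here."""
    return "\t".join([query, subject, "90.0", "100", "0", "0",
                      "1", "100", "1", "100", "1e-50", str(bitscore)])


# Hypothetical identifiers: 'xen*' for Xenia proteins, 'eve*' for E. verrucosa proteins.
xenia_vs_eunicella = [row("xenA", "eveA", 200), row("xenA", "eveB", 150),
                      row("xenB", "eveB", 180), row("xenC", "eveA", 120)]
eunicella_vs_xenia = [row("eveA", "xenA", 200), row("eveA", "xenC", 120),
                      row("eveB", "xenB", 180), row("eveB", "xenA", 150)]

pairs = reciprocal_best_hits(xenia_vs_eunicella, eunicella_vs_xenia)
```

In this toy data `xenC` is dropped because its best hit, `eveA`, points back to `xenA`; ties in bit score are resolved in favour of the first row encountered, which a production analysis would probably want to handle explicitly.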

    Another major issue with the manuscript is the section referring to SCRiPs. First, the authors do not cite Jouiaei, Sunagar et al. (2015), which was the first publication to functionally characterize SCRiPs as toxins. Additionally, the majority of SCRiPs identified in this study and those found in Eunicella have a different cysteine framework. The authors acknowledge this on line 245 but claim that, given the AlphaFold structure is similar, they are from the same gene family. First, I think this is very weak support, as typically sharing a conserved cysteine framework is the bare minimum to categorize these toxins in a gene family. Although some cysteine frameworks are somewhat hard to resolve as the space between the cysteines can be variable, in this case, SCRiPs have a very distinct triple repeat of cysteines near the C terminal that is missing in these octocoral SCRiPs. This makes me doubt that these are indeed from the same gene family. Then relying on AlphaFold to predict the structure and claiming it's similar to Tau-AnmTx Ueq 12-1 from Urticina eques is also fairly weak support. Although I am not an expert in protein structures, I cannot tell from the images comparing the 2 structures in the supplementary figure s1 that these are similar. Perhaps you could align or overlap them, or give some readout of the similarity of these structures. Currently, I am skeptical of any of the SCRiPs described in this manuscript. Additionally, if the authors can show that indeed these are SCRiPs, again I would strongly advise the authors to check the Xenia scRNA-seq to see if these Xenia SCRiP-like sequences are expressed in cnidocytes.

    Given the concerns raised by the Reviewer, throughout the text we now refer to octocoral SCRiPs as SCRiP-like proteins or octo-SCRiPs. A reference to Jouiaei, Sunagar et al. (2015) was added. However, we would like to point out that we do not associate them with hexacoral SCRiPs on the basis of predicted structural similarity alone: Suppl. Fig. 2 presents the alignment of the sequences of these proteins with representative sequences from Hexacorallia, highlighting a sequence similarity of up to 68%. Considering the high level of sequence divergence generally recognized within toxin families, this high similarity value supports our claims. Despite the relevance of the cysteine framework in defining toxin families, a single amino acid shift is not necessarily indicative of a new structural family.

    Concerning the structural comparison between SCRiPs and octo-SCRiPs, Suppl. Figure 2.B has been replaced with a superposition of the structure of AnmTx Ueq 12-1 with the model of U-GRTX-Esi1a. The structures were aligned with TM-align, resulting in a Cα RMSD of 1.86 Å for the aligned region, which supports the close structural similarity of the two proteins.
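For readers unfamiliar with the metric, the Cα RMSD quoted above is the root-mean-square distance between paired Cα atoms after superposition. The short Python sketch below shows only that final calculation, under the assumption that the two coordinate lists are already superposed and paired residue by residue; TM-align performs the alignment and superposition itself, and the function name and toy coordinates here are hypothetical.

```python
from math import sqrt


def ca_rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two paired lists of (x, y, z) points.

    Assumes both structures are already superposed and that the i-th point
    in `coords_a` corresponds to the i-th point in `coords_b`.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum((xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
                  for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b))
    return sqrt(squared / len(coords_a))


# Toy coordinates in angstroms (not taken from any real structure).
structure_a = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
structure_b = [(0.0, 0.0, 1.0), (3.8, 0.0, 1.0), (7.6, 0.0, 1.0)]
toy_rmsd = ca_rmsd(structure_a, structure_b)
```

Because RMSD is computed only over the aligned region, it is usually read together with the number of aligned residues and a length-normalised score when judging fold similarity.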

    Unfortunately, we need to rely on available genome annotations for the evaluation of the Xenia scRNA-seq data. The only currently annotated Xenia gene showing significant homology with the SCRiP-like proteins of E. singularis (Xe_002907) has a very different organization, as it shows five consecutive cysteine-rich domains, and is therefore not orthologous to any of the three sequences we report in the present work. In the paper by Hu and colleagues, Xe_002907 is associated with cluster 2, which is unrelated to nematocysts.

    Minor:

    The ShK protein, U-GRTX-Esi4, strikes me as similar to the NEP3 gene family identified in Nematostella, which also has 3 ShK domains (Columbus-Shenkar et al. 2018).

    We have added a reference to the NEP3 family in the text and discussed the similarities of U-GRTX-Esi4 with its members, highlighting that, while in NEP3 the mature toxin corresponds only to the first ShK domain, U-GRTX-Esi4 is supported as a multidomain protein by our proteomic analyses.

    Interestingly, U-GRTX-Esi20 and 21 were found to be structurally similar to acrorhagin 1a but do not share a conserved cysteine framework (6 cysteines vs 8). One thing that the authors should be careful of, and perhaps point out that this is indeed not nematocyst-specific, is that an ortholog of acrorhagin 1a was found to be expressed in the neurons in Nematostella (Sachkova et al. 2020). Perhaps ancestral acrorhagin 1 was found in the last common ancestor of Anthozoa but was a neuropeptide that got recruited to the venom in Actinia.

    Because of the methodology employed, we expected the NEM-P to be a toxin-enriched subset of the WB-P. Indeed, some of the toxin-like proteins detected in the NEM-P were not observed in the WB-P, where they might have been below the limit of detection (LOD) during proteomic analysis. On the other hand, since the WB-P is a whole-body proteome, we expect it to also contain nematocyst-specific proteins. At present, the detection of U-GRTX-Esi20 and 21 in the WB-P does not rule out that these may be nematocyst-specific, whereas their presence in the NEM-P, in our view, confirms their occurrence in the venom. At any rate, given the current level of evidence, this Reviewer is right in considering all possibilities, such as their neuropeptide nature. These considerations have been added to the text.

    Also, in general the authors refer to a lot of phylogenetics that I cannot see in the paper. For example, on line 339: "Our genomic survey indicates that these two toxins belong to two distinct monophyletic orthogroups within a very large superfamily of cysteine-rich peptides, encoded by ancestrally duplicated paralogous genes with intronless structures, that also include other members in E. singularis, not detected in the NEM-P." What genomic survey are you referring to (where is this data)? What do you mean by "belong to two distinct monophyletic orthogroups"?

    In an attempt to keep the manuscript concise, we had concentrated the comparative genomic analyses in the supplementary material. We now provide in the main text a detailed phylogenetic tree that displays the complex evolutionary relationships between U-GRTX-Esi20 and 21 and a number of other related sequences sharing significant sequence homology and predicted structural organization (Figure 6). In detail, the two Eunicella toxins belong to two groups of sequences, labeled as “type I” and “type VI”, which are strongly supported as monophyletic within Malacalcyonacea (bootstrap values of 94 and 95, respectively). Notably, we could identify four additional monophyletic groups, characterized by similar support values, that included sequences from both Eunicella and other Malacalcyonacea species (type II, III, IV and V). Nevertheless, these sequences were not identified as venom components by our proteomic analyses. Related proteins were also identified in species belonging to Scleralcyonacea, even though their precise relationships with those of Malacalcyonacea were often unclear.

    Also, there is no visualization of the results when the authors refer to the genomic surveys, especially when referring to intron-exon boundaries. Please include which genomes include which sequences and their given intron-exon boundaries for a given gene family. I do not understand how the authors resolved figure 4. How do you know there was a loss, not a gain, of exon 2 in the gene encoding U-GRTX-Esi17? Providing the genomic loci for the toxin gene families would help. Maybe something like figure 5 from Koludarov et al. (2024) would be useful, but ideally including intron-exon boundaries.

    The scenario we propose is far more parsimonious than the alternative hypothesis involving an intron gain, since this would have required an extremely complex combination of far less likely events, i.e. the independent acquisition of two partial colipase-like arrays in positions compatible with the generation of a complete colipase-like cysteine array. Despite being theoretically possible, we believe this scenario to be highly unlikely, also considering the well-established differences between the rates of intron gain and intron loss in eukaryotes, with the latter exceeding the former by several orders of magnitude (see Roy and Gilbert, 2005, https://doi.org/10.1073/pnas.0500383102).

    We present a supplementary figure which schematically displays the architecture of the genes encoding the novel putative venom components described in this manuscript. We must point out that, as mentioned in the main text, no genome assembly is presently available for E. singularis, and therefore such gene architectures have been inferred from the congeneric species E. verrucosa. Although certainly interesting, the approach proposed by the reviewer with reference to figure 5 from Koludarov et al., which would basically involve a microsynteny analysis for all loci, would go far beyond the aims and scope of the present work and require an unreasonable workload, with a very marginal increase in the quality of the data we report. First and foremost, no genome assembly is available for our target species. Moreover, very few genomes of Octocorallia are associated with publicly available gene annotations (in detail, no gene annotation tracks are available for R. reniformis, P. caledonicum, V. gustaviana, P. papillata, Chrysogorgia sp., H. coerulea, P. subtilis, Trachytela sp. and M. muricata). The lack of existing annotations de facto prevents us from retrieving flanking genes and providing evolutionary insights at the level requested by the reviewer. We believe that the manual annotation of the target genes of interest in all analyzed species fully meets the objectives of this study.

    In the methods the authors mention:

    "Whenever needed (i.e., U-GRTX-Esi20 and 21), a fine-scale classification of orthologous sequences was aided by Maximum Likelihood phylogenetic inference analyses, carried out with IQ-Tree [49] with 1000 ultrafast bootstrap replicates based on the best-fitting model of molecular evolution detected by ModelFinder [50]."

    So please include this data as supplementary figures. The authors did plenty of analysis they refer to but do not include this in the paper. This lack of data makes it very hard to follow many of the phylogenetic and genomic insights from this manuscript.

    The phylogenetic tree concerning U-GRTX-Esi20 and 21 has been added to the main text as Figure 6. In nearly all other cases where we referred to comparative genomics analyses, our inferences were simply based on the detection (or lack thereof) of orthologous genes. Considering the narrow taxonomic distribution of most target sequences, which prevents the identification of suitable outgroups for tree rooting purposes, and their usual presence as single-copy genes in E. singularis, we do not think that phylogenetic trees would add useful information to the manuscript. Nevertheless, we have added the multiple sequence alignments of all relevant groups of orthologous sequences as supplementary figures.

    Reviewer #2 (Significance (Required)):

    This work can be very useful in extending our knowledge of venom in cnidarians and can help build better resolution of the evolutionary history of these ecologically essential proteins.

    Reviewer #3 (Evidence, reproducibility and clarity (Required)):

    SECTION A - Evidence, reproducibility and clarity

    =================================================

    Summary: Provide a short summary of the findings and key conclusions (including methodology and model system(s) where appropriate).

    This manuscript describes the proteotranscriptomic analysis of samples from the coral Eunicella singularis. A number of putative venom toxins are identified. In silico structural analyses are performed for select putative toxins and inferred activity/function is discussed. In my opinion the subject of the study is important. However, I have some important questions about the methodology (regarding "venom collection" and assignment of "venom components"), and given the preliminary nature of the study I found some of the conclusions (regarding activity) somewhat overstated.

    Major comments:

    • Are the key conclusions convincing?

    While some conclusions were justified, I felt unconvinced by others. Some of my pessimism stems from the technique used to extract the venom i.e. ethanol immersion. I'm not familiar with the use of this technique, however it strikes me as likely to be associated with some limitations. For example, while the nematocysts may indeed discharge their contents I would expect some contents e.g. larger proteins to be insoluble. Was this considered? This would have some major impacts on the conclusions drawn e.g. (L418: "absence, in the NEM-P of E. singularis, of the common cnidarian cytolytic proteins." AND (L492): "conventional pore forming toxins (PFTs) of Cnidaria, including the aerolysin-like Δ-GRTX-Esi29 and the two actinoporins Δ-GRTX-Esi30 and 31 were not retrieved in the nematocysts' proteome."

    Because of this observation, the authors concluded that these were not venom components in this species and speculated on other functions. However, I can't help wondering if these were simply excluded from analysis as a result of the ethanol extraction i.e. a false negative.

    As anticipated in our response to Reviewer #1, we opted for ethanol extraction due to sample limitation and unsuccessful attempts with other venom collection protocols. The procedure we employed was first described by Jouiaei et al., 2015, to extract venom from the tentacles of Chironex fleckeri. Proteins and peptides extracted from the nematocysts were indeed precipitated from ethanol and subsequently resuspended for proteomic analysis. The original protocol by Jouiaei et al. used precipitation at -80°C to recover the proteins from ethanol. Albeit denaturing, this protocol should not imply sample losses. Large proteins that did precipitate were still resuspended and analyzed. We have introduced an evaporation/lyophilization step, which should not alter the outcome. In fact, we did detect higher molecular weight proteins in the NEM-P (mostly structural and enzymes). While denaturation and precipitation may functionally inactivate these proteins, these should all be detected by proteomics. The authors of the original paper presented a comparison between the venom obtained from ethanol extracted tentacles and the proteome of pressure disrupted purified nematocysts. In both cases, additional “non venom” and “structural” proteins were also detected (e.g. histones, filamin, ribosomal proteins, myosin, actin, collagen…). Given the prevalence of toxins or toxin-like proteins in our extract, we were reasonably convinced of the success of the extraction protocol. For sure, the method may present limitations: as also observed by Reviewer #1 and #3, contamination with non-nematocyst proteins is possible. This has also been considered. In fact, we adopted a conservative approach, choosing to discuss in detail only proteins with structural similarities with known toxins and/or typical toxin-like features. 
    On the other hand, as noted by this Reviewer, our results may be partial, but, in our opinion, this would be most likely due to incomplete nematocysts firing rather than to sample loss. All these possibilities have now been better discussed and addressed in the text. At any rate, we are convinced that the protein diversification detected in the NEM-P is indicative of the presence of several venom components and provides a first indication of the existence of novel, octocoral-specific, venom protein families.

    Comparisons were made to other tissue samples (whole bodies). Were these samples prepared in the same way i.e. ethanol extraction? If not, the power of any comparisons would be limited.

    Following the described experimental approach, we expected the NEM-P to be a subset of the WB-P, for which no purification or enrichment of any sort was performed. In fact, we reported both proteomes to confirm the enrichment of the NEM-P in venom components, highlighting the presence of putative toxins that might have been below the instrumental limit of detection in the more crowded whole-body protein extract. At any rate, we have now modified the text, adding cautionary statements that may also explain our results.

    • It was unclear to me exactly how "venom components" (Fig. 1A) were defined. Why are "enzymes" , "structural" and "unknow" NOT considered venom components when they were identified in the "venom" extract?

    The “structural” and “enzymes” categories were used to analyze the hits in the NEM-P. We decided to discuss only putative neurotoxins or cytolytic toxins based on the limited selectivity of the extraction protocol employed and on the lack of histological control. As structural components and enzymes, in the absence of a crude venom extract, may derive from other tissues, we preferred not to discuss them. We hope this is clearer in the amended version of the manuscript.

    Furthermore, a large proportion of proteins detected are "structural" - doesn't this suggest that the "venom" extract included a large proportion of false positives i.e. non-toxin proteins? Is it possible that some of the proteins which are considered as "venom components" are also false positives?

    As also noted by Reviewer #1, aside from contamination from other tissues, some of the toxin-like proteins we identified may have different functions (e.g., neuronal, developmental) and their toxin function is presumed on the basis of structural features. This issue is clearly addressed in the manuscript. Nonetheless, putative toxins are definitely enriched in the NEM-P compared to the WB-P, which leads us to believe that the NEM-P is a fraction enriched in nematocyst content. This is now more evident also in the PEAKS output files, provided as Supplementary Tables 2 and 3.

    The nematocyst ethanol extract is referred to throughout the manuscript as "venom". Similarly, what I would consider putative toxins are referred to throughout the manuscript as "toxins". Given the preliminary nature of the study I suggest the authors consider rewording these.

    This has been changed throughout the text.

    In short, the evidence presented left me unconvinced that the nematocyst ethanol extract that was analysed represented the genuine "venom" of this species and that the "toxins" identified represent the genuine toxin repertoire. The authors should at least discuss potential limitations, defend their claims in this context and adjust conclusions accordingly.

    We hope that the additional clarifications provided in the Results and Discussion section, and the amendments we made throughout the manuscript, make our statements more convincing.

    Should the authors qualify some of their claims as preliminary or speculative, or remove them altogether? See comment above regarding venom collection and conclusions drawn.

    We have introduced cautionary statements throughout the text.

    Also, despite the absence of any experimental activity/functional data, there was a lot of inference about activity and function.

    A few examples: L299 - "might have acquired peculiar biological activity."

    L301 - "support their relevance for the predatory and/or defensive strategies…"

    L326 - "abundance of this protein suggests a strong functional relevance…"

    L358 - "the structure presented a SCRiP-like W-shaped fold, indicative of a potential neurotoxic function."

    L427 - "suggestive of a peculiar chemical selectivity towards different lipids"

    L506 - "the cytolytic activity seems to be ascribable mostly to the six saposins"

    I suggest some removal or rewording throughout the Results/Discussion section to reflect the fact that most of this is purely speculative.

    This has been modified according to the reviewer’s suggestions.

    Regarding the following statement on L300 - "Notably, the transcripts for all these toxins had exceptionally high TPM values (1806, 569, 826 and 429, respectively for the U-GRTX-Esi14 to 17/18), which support their relevance for the predatory and/or defensive strategies of Eunicella singularis." These TPM values don't seem high to me e.g. 1806 TPM = 0.0018% of transcripts. How do these numbers compare to other "non-venom" components of the transcriptome? A graph illustrating this would be helpful.

    We thank the Reviewer for this suggestion. The expression values we report in this work were calculated from an RNA-seq library generated from a whole-body sample. Consequently, considering the low relative abundance of nematocysts with respect to total body weight, we expect the contribution of this cell type to the total extracted RNA to be rather low. We exploited a previously published single-cell RNA-seq dataset obtained from another octocoral species (i.e. Xenia, see Hu et al., 2020, Nature) to identify the most likely candidate nematocyst-specific mRNAs having a 1:1 orthology relationship with E. singularis. In detail, we were able to detect high-confidence 1:1 orthologs for 242 out of the 432 Xenia genes included in cluster 11 in the study by Hu and colleagues (i.e. the cluster associated with nematocysts). This allowed us to assess the expression in E. singularis of the orthologous sequences, which are expected to share a similar cell specificity. The 242 putative nematocyst-specific mRNAs displayed an average expression level of 16.65 TPM (median = 4.85 TPM) in the whole-body sample, and just 8 of these (i.e. about 3% of the total) had an expression level higher than 100 TPM. Based on these observations, we believe that our statement that “all these toxins had exceptionally high TPM values” holds true. Supplementary table 2 reports the sequences of the toxins identified in the NEM-P together with the TPM of the corresponding transcripts.
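
    The comparison described above can be sketched in a few lines. This is only an illustration: the reference TPM list is hypothetical and stands in for the 242 orthologs, and the function names are ours. For scale, a value of 1806 TPM means 1806 transcripts per million, i.e. roughly 0.18% of all transcripts.

```python
from statistics import mean, median

def tpm_to_percent(tpm):
    """Convert transcripts per million (TPM) to a percentage of all transcripts."""
    return tpm / 1_000_000 * 100

def summarize_reference(tpms, threshold=100.0):
    """Mean, median and fraction of reference transcripts above a TPM threshold."""
    above = sum(1 for value in tpms if value > threshold)
    return {
        "mean": mean(tpms),
        "median": median(tpms),
        "fraction_above": above / len(tpms),
    }

def fraction_below(candidate_tpm, reference_tpms):
    """Fraction of reference transcripts expressed at a lower level than the candidate."""
    lower = sum(1 for value in reference_tpms if value < candidate_tpm)
    return lower / len(reference_tpms)

# Hypothetical TPM values standing in for the nematocyst-specific orthologs.
reference_tpms = [1.2, 2.0, 3.5, 4.85, 8.0, 16.0, 45.0, 120.0]
# TPM values quoted in the text for the putative toxins.
toxin_tpms = [1806, 569, 826, 429]

summary = summarize_reference(reference_tpms)
for tpm in toxin_tpms:
    print(tpm, round(tpm_to_percent(tpm), 4), fraction_below(tpm, reference_tpms))
```

    Ranking each putative toxin against a cell-type-matched reference set, rather than against the whole transcriptome, is the design choice that makes the comparison meaningful for a whole-body library.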

    Regarding the following statement on L463 - "Our investigation unequivocally demonstrated that Octocorallia do produce venom" Was it not already known that Octocorallia have nematocysts and therefore are venomous (in which case this should be cited)? If this wasn't known, I don't think this study was really designed to test this hypothesis. Regardless, I don't think this is a meaningful claim to make here.

    This observation is correct. We have rephrased the text accordingly.

    Table S2: on what basis are the sequences highlighted in red considered "proteomics validated" e.g. confidence, coverage? Could a protein abundance column be included in this table (for NEM and WB tissues)?

    Residues highlighted in red in Table S2 (now Suppl. Tab. 4) correspond to the tryptic peptides identified with good confidence by the LC-MS analysis. As requested by Reviewer #1, we have added supplementary files with the summary of the PEAKS Studio outputs for the two proteomes, highlighting the confidence and coverage scores. In Suppl. Tab. 4, coverage has been recalculated considering the sequence of the predicted mature peptide (not the precursor identified by PEAKS Studio). Finally, as PEAKS Studio does not provide a quantitative measure of the identified peptides (i.e., counts), we have calculated and added to said tables the exponentially modified Protein Abundance Index (emPAI), which provides an approximate label-free measure of each protein’s abundance. We have also added the relative emPAI, which normalizes each protein's emPAI value to the total emPAI of all proteins in the sample, providing a percentage abundance. It is noteworthy that all the proteins that have been identified as putative toxins have higher relative emPAI values in the NEM-P, thus providing additional indirect support for the validity of the ethanol extraction protocol (see Suppl. Tab. 2 and 3).
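
    For readers unfamiliar with the index, a minimal sketch of the calculation follows. It assumes the standard definition, emPAI = 10^(observed peptides / observable peptides) - 1, with the relative value expressed as a percentage of the summed emPAI. The peptide counts and protein names are hypothetical, and how the observable peptides are counted is left as an input.

```python
def empai(n_observed, n_observable):
    """Exponentially modified Protein Abundance Index for one protein."""
    if n_observable <= 0:
        raise ValueError("n_observable must be positive")
    return 10 ** (n_observed / n_observable) - 1

def relative_empai(empai_by_protein):
    """Express each emPAI as a percentage of the summed emPAI in the sample."""
    total = sum(empai_by_protein.values())
    if total <= 0:
        raise ValueError("summed emPAI must be positive")
    return {name: 100 * value / total for name, value in empai_by_protein.items()}

# Hypothetical (observed, observable) peptide counts, for illustration only.
peptide_counts = {"toxin_like": (4, 4), "structural": (2, 4)}
empai_values = {name: empai(obs, total) for name, (obs, total) in peptide_counts.items()}
relative_values = relative_empai(empai_values)
print(relative_values)
```

    Because the relative value is normalized within each sample, it can be compared between the NEM-P and the WB-P even though the two extracts differ in complexity.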

    • Would additional experiments be essential to support the claims of the paper? Request additional experiments only where necessary for the paper as it is, and do not ask authors to open new lines of experimentation. Additional experiments e.g. synthesis and activity assays would go a long way towards bolstering some of the conclusions. However, if some of the conclusions can be toned down a little (see comments above), I don't consider these to be essential.

    In my opinion, the study would benefit from some additional analyses (described in the comments above).

    See our answers to the specific comments above.

    Are the suggested experiments realistic in terms of time and resources? It would help if you could add an estimated cost and time investment for substantial experiments.

    N/A

    Are the data and the methods presented in such a way that they can be reproduced?

    Yes.

    Are the experiments adequately replicated and statistical analysis adequate? No - I may be wrong, but as far as I can tell from the text, replicates were not collected. Three cDNA libraries were generated but were these replicates (please clarify this in the Methods)? It could be reasonably argued (and I would mostly agree) that replicates are not necessary for a general analysis of the composition of the samples. However in a couple of instances conclusions are drawn based on "differential expression". I suggest that in the absence of expression level replicates these conclusions should be withdrawn.

    The statements about differential expression (more correctly differential maturation) are based on proteomics results and not on DEG analysis in the transcriptome (see also reply to reviewer #1). All the claims have been rephrased and the supplementary figure 1 has been added to support our statements.

    As for the cDNA libraries, they were prepared as technical replicates to account for variations in venom expression among samples, and the resulting sequencing data were pooled before assembly, as explained in the Methods section.

    • "Abundance" of proteins or toxins was mentioned on occasion, but no data on quantification or abundance of proteins is mentioned anywhere (although this is something that could be done with the LC-MS/MS data). In my opinion these data would be very useful and should be included, especially if mentioned in the text.

    As previously discussed, we have calculated and added to the PEAKS output file the emPAI and the relative emPAI values. These data are now provided in the supplementary Tables 2 and 3.

    Minor comments:

    Specific experimental issues that are easily addressable.

    Are there limitations to the ethanol extraction procedure (please add a paragraph in the Discussion)? Are there any previous studies using this procedure?

    This has been done: the potential drawbacks of the ethanol extraction procedure are now addressed in the Results and Discussion section.

    Are prior studies referenced appropriately?

    Yes, for the most part (but see comment above).

    Are the text and figures clear and accurate?

    In general yes, although I found myself looking for actual data. Most of the current figures are summaries or cartoons. I would have liked to have seen pictures of the species in question (including a picture/diagram of the tissue from which the cDNA libraries and proteomes were derived); a picture of the nematocysts; the total ion chromatogram of the "venom"; Some type of figure to place the "toxin" expression level in the context of all transcripts; some more of the actual sequences identified including alignments (in the main text rather than the SI);

    Various figures in the manuscript have been modified in accordance with the Reviewers’ suggestions. We have included a workflow of the extraction with a picture of E. singularis and modified Fig. 1 (now Fig. 2) to include the TIC of the NEM-P.

    Figure 4: could the motifs and termini for each be labelled please.

    This has been done.

    Do you have suggestions that would help the authors improve the presentation of their data and conclusions? See comments above. In my opinion, the work done was quite preliminary (i.e. analysis of a single species and does not include any activity/functional data) but still significant and useful to the field. I felt that some of the conclusions were unnecessarily over-reaching and could be toned down without detracting from the importance of the manuscript.

    Several instances of hyperbole could be toned down e.g. use of the words: remarkable (L27); rich (L28); intricate (L38); significant (L189); peculiar (L299, 427); only (L191); exceptionally (L300); extremely (L316); strong (L326). Similarly, some wording is subjective e.g. "worthy of" (L33); "interestingly" (L220, 382, 426, 492, 535). Please amend.

    We have toned down our statements throughout the manuscript.

    "Homology" is used throughout when referring to similarity. Please change.

    This has been done.

    Minor typos and similar:

    2.5 cm (L97) - use 25 mm (cm is not a standard scientific measure).

    30" (L97) - 30 min?

    ml (L97) - mL is technically correct although some journals use ml, regardless should be consistent throughout.

    Reverse-phase (L127) – reversed-phase

    30,000 (L141) – units?

    Typos were corrected.

    Reviewer #3 (Significance (Required)):

    SECTION B – Significance

    ========================

    - Describe the nature and significance of the advance (e.g. conceptual, technical, clinical) for the field.

    Cnidarian venoms and toxins have been the subject of extensive study over the past several decades. However there has been very little work performed on corals. In this respect, the subject of this manuscript is significant.

    - Place the work in the context of the existing literature (provide references, where appropriate).

    The subject of this manuscript i.e. the characterisation of the venom composition of a coral is an interesting topic. The work is rather preliminary, but still represents an important addition to the literature (without requiring overinterpretation of the results-see comments above).

    - State what audience might be interested in and influenced by the reported findings.

    I would expect the manuscript to be of interest to others working in the toxinology field, particularly those working on Cnidarian venoms or toxins.

    - Define your field of expertise with a few keywords to help the authors contextualize your point of view. Indicate if there are any parts of the paper that you do not have sufficient expertise to evaluate.

    *Venom; Toxins; Pep

  2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


    Referee #3

    Evidence, reproducibility and clarity

    Summary:

    Provide a short summary of the findings and key conclusions (including methodology and model system(s) where appropriate).

    This manuscript describes the proteotranscriptomic analysis of samples from the coral Eunicella singularis. A number of putative venom toxins are identified. In silico structural analyses are performed for select putative toxins and inferred activity/function is discussed. In my opinion the subject of the study is important. However, I have some important questions about the methodology (regarding "venom collection" and assignment of "venom components"), and given the preliminary nature of the study I found some of the conclusions (regarding activity) somewhat overstated.

    Major comments:

    • Are the key conclusions convincing?

    While some conclusions were justified, I felt unconvinced by others. Some of my pessimism stems from the technique used to extract the venom i.e. ethanol immersion. I'm not familiar with the use of this technique, however it strikes me as likely to be associated with some limitations. For example, while the nematocysts may indeed discharge their contents I would expect some contents e.g. larger proteins to be insoluble. Was this considered? This would have some major impacts on the conclusions drawn e.g. (L418: "absence, in the NEM-P of E. singularis, of the common cnidarian cytolytic proteins." AND (L492): "conventional pore forming toxins (PFTs) of Cnidaria, including the aerolysin-like Δ-GRTX-Esi29 and the two actinoporins Δ-GRTX-Esi30 and 31 were not retrieved in the nematocysts' proteome." Because of this observation, the authors concluded that these were not venom components in this species and speculated on other functions. However, I can't help wondering if these were simply excluded from analysis as a result of the ethanol extraction i.e. a false negative.

    Comparisons were made to other tissue samples (whole bodies). Were these samples prepared in the same way i.e. ethanol extraction? If not, the power of any comparisons would be limited.

    It was unclear to me exactly how "venom components" (Fig. 1A) were defined. Why are "enzymes" , "structural" and "unknow" NOT considered venom components when they were identified in the "venom" extract? Furthermore, a large proportion of proteins detected are "structural" - doesn't this suggest that the "venom" extract included a large proportion of false positives i.e. non-toxin proteins? Is it possible that some of the proteins which are considered as "venom components" are also false positives?

    The nematocyst ethanol extract is referred to throughout the manuscript as "venom". Similarly, what I would consider putative toxins are referred to throughout the manuscript as "toxins". Given the preliminary nature of the study I suggest the authors consider rewording these.

    In short, the evidence presented left me unconvinced that the nematocyst ethanol extract that was analysed represented the genuine "venom" of this species and that the "toxins" identified represent the genuine toxin repertoire. The authors should at least discuss potential limitations, defend their claims in this context and adjust conclusions accordingly.

    • Should the authors qualify some of their claims as preliminary or speculative, or remove them altogether?

    See comment above regarding venom collection and conclusions drawn.

    Also, despite the absence of any experimental activity/functional data, there was a lot of inference about activity and function. A few examples: L299 - "might have acquired peculiar biological activity." L301 - "support their relevance for the predatory and/or defensive strategies..." L326 - "abundance of this protein suggests a strong functional relevance..." L358 - "the structure presented a SCRiP-like W-shaped fold, indicative of a potential neurotoxic function." L427 - "suggestive of a peculiar chemical selectivity towards different lipids" L506 - "the cytolytic activity seems to be ascribable mostly to the six saposins" I suggest some removal or rewording throughout the Results/Discussion section to reflect the fact that most of this is purely speculative.

    Regarding the following statement on L300 - "Notably, the transcripts for all these toxins had exceptionally high TPM values (1806, 569, 826 and 429, respectively for the U-GRTX-Esi14 to 17/18), which support their relevance for the predatory and/or defensive strategies of Eunicella singularis." These TPM values don't seem high to me e.g. 1806 TPM = 0.0018% of transcripts. How do these numbers compare to other "non-venom" components of the transcriptome? A graph illustrating this would be helpful.

    Regarding the following statement on L463 - "Our investigation unequivocally demonstrated that Octocorallia do produce venom" Was it not already known that Octocorallia have nematocysts and therefore are venomous (in which case this should be cited)? If this wasn't known, I don't think this study was really designed to test this hypothesis. Regardless, I don't think this is a meaningful claim to make here.

    Table S2: on what basis are the sequences highlighted in red considered "proteomics validated" e.g. confidence, coverage? Could a protein abundance column be included in this table (for NEM and WB tissues)?

    • Would additional experiments be essential to support the claims of the paper? Request additional experiments only where necessary for the paper as it is, and do not ask authors to open new lines of experimentation.

    Additional experiments e.g. synthesis and activity assays would go a long way towards bolstering some of the conclusions. However, if some of the conclusions can be toned down a little (see comments above), I don't consider these to be essential.

    In my opinion, the study would benefit from some additional analyses (described in the comments above).

    • Are the suggested experiments realistic in terms of time and resources? It would help if you could add an estimated cost and time investment for substantial experiments.

    N/A

    • Are the data and the methods presented in such a way that they can be reproduced?

    Yes.

    • Are the experiments adequately replicated and statistical analysis adequate?

    No - I may be wrong, but as far as I can tell from the text, replicates were not collected. Three cDNA libraries were generated but were these replicates (please clarify this in the Methods)? It could be reasonably argued (and I would mostly agree) that replicates are not necessary for a general analysis of the composition of the samples. However in a couple of instances conclusions are drawn based on "differential expression". I suggest that in the absence of expression level replicates these conclusions should be withdrawn.

    "Abundance" of proteins or toxins was mentioned on occasion, but no data on quantification or abundance of proteins is mentioned anywhere (although this is something that could be done with the LC-MS/MS data). In my opinion these data would be very useful and should be included, especially if mentioned in the text.

    Minor comments:

    • Specific experimental issues that are easily addressable.

    Are there limitations to the ethanol extraction procedure (please add a paragraph in the Discussion)? Are there any previous studies using this procedure?

    • Are prior studies referenced appropriately?

    Yes, for the most part (but see comment above).

    • Are the text and figures clear and accurate?

    In general yes, although I found myself looking for actual data. Most of the current figures are summaries or cartoons. I would have liked to have seen pictures of the species in question (including a picture/diagram of the tissue from which the cDNA libraries and proteomes were derived); a picture of the nematocysts; the total ion chromatogram of the "venom"; Some type of figure to place the "toxin" expression level in the context of all transcripts; some more of the actual sequences identified including alignments (in the main text rather than the SI);

    Figure 4: could the motifs and termini for each be labelled please.

    • Do you have suggestions that would help the authors improve the presentation of their data and conclusions?

    See comments above. In my opinion, the work done was quite preliminary (i.e. analysis of a single species and does not include any activity/functional data) but still significant and useful to the field. I felt that some of the conclusions were unnecessarily over-reaching and could be toned down without detracting from the importance of the manuscript.

    Several instances of hyperbole could be toned down e.g. use of the words: remarkable (L27); rich (L28); intricate (L38); significant (L189); peculiar (L299, 427); only (L191); exceptionally (L300); extremely (L316); strong (L326). Similarly, some wording is subjective e.g. "worthy of" (L33); "interestingly" (L220, 382, 426, 492, 535). Please amend.

    "Homology" is used throughout when referring to similarity. Please change.

    Minor typos and similar:

    2.5 cm (L97) - use 25 mm (cm is not a standard scientific measure). 30" (L97) - 30 min? ml (L97) - mL is technically correct although some journals use ml, regardless should be consistent throughout. Reverse-phase (L127) - reversed-phase 30,000 (L141) - units?

    Significance

    • Describe the nature and significance of the advance (e.g. conceptual, technical, clinical) for the field.

    Cnidarian venoms and toxins have been the subject of extensive study over the past several decades. However there has been very little work performed on corals. In this respect, the subject of this manuscript is significant.

    • Place the work in the context of the existing literature (provide references, where appropriate).

    The subject of this manuscript i.e. the characterisation of the venom composition of a coral is an interesting topic. The work is rather preliminary, but still represents an important addition to the literature (without requiring overinterpretation of the results-see comments above).

    • State what audience might be interested in and influenced by the reported findings.

    I would expect the manuscript to be of interest to others working in the toxinology field, particularly those working on Cnidarian venoms or toxins.

    • Define your field of expertise with a few keywords to help the authors contextualize your point of view. Indicate if there are any parts of the paper that you do not have sufficient expertise to evaluate.

    Venom; Toxins; Peptides

  3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


    Referee #2

    Evidence, reproducibility and clarity

    Summary:

    The authors of this work explore the venom repertoire of octocorals, a group of cnidarians whose venom has largely been ignored in the literature. As a first step into characterizing the venom of octocorals, the authors use a proteo-transcriptomic approach for Eunicella singularis. Specifically, they generated the transcriptome and proteome from whole-body as well as a more specific proteome of the nematocyst, a specialized sub-cellular structure found only in cnidarians and used to inject venom. The nematocyst proteome is a crucial dataset of the manuscript as it allows the authors to discriminate what is most likely a bona fide toxin compared to general physiological proteins.

    Major:

    However, I have some skepticism regarding the legitimacy of this nematocyst proteome. Specifically, I question whether the proteins from this are nematocyst-specific. The authors used an approach to soak the animal in ethanol, which theoretically should cause the nematocysts to fire, releasing the venom housed inside. This is a technique previously used in box jellyfish, where it was shown using histological approaches that the nematocysts had indeed fired. However, this was not validated for Eunicella singularis. I am hesitant to fully accept that the data from the nematocyst proteome is specific. Other approaches, such as isolating nematocysts using a Percoll gradient, will likely generate a more specific nematocyst proteome. This Percoll gradient approach has been used to isolate nematocysts from different species of cnidarians ranging from hydra to sea anemones. However, I recognize that although this approach is robust for different cnidarians, acquiring enough material is challenging and may be beyond the capacity for this octocoral. I would argue this would be the best approach, but if not feasible I can understand. However, other potential validation could be used to help improve the confidence that this is, at least mostly, nematocyst-specific. Furthermore, one could argue that this ethanol approach used in box jellyfish also specifically used tentacles, a tissue significantly enriched in nematocysts, likely greatly improving the specificity in isolating nematocyst-specific proteins, whereas in this study they use a collection of whole polyps; therefore, anything that is extracted from the ethanol would precipitate. This is a much more complex collection of tissues which I would assume could interfere with isolating nematocyst-specific proteins.

    A computational approach that I think is essential is to use the Xenia single-cell atlas. Xenia is also an octocoral with a nice single-cell atlas in which the cnidocytes form a distinct cluster. The authors can perform a reciprocal best BLAST hit search between the Xenia genome and the Eunicella singularis transcriptome and then see whether the genes encoding proteins found in the Eunicella nematocyst proteome have orthologs among the genes found in the Xenia cnidocyte cluster. A statistical test could then be performed to show that there is a significant overlap between the nematocyst proteins from Eunicella and their orthologs in the Xenia cnidocyte cluster. This is still quite indirect but can give some insights. A better approach would be to perform proteomics on Xenia using the ethanol approach and map the captured proteins to see where they are found in the atlas. This would massively elevate this work and provide proof that the ethanol approach is indeed capable of precipitating nematocyst-specific proteins. I would strongly recommend trying to provide some evidence that these proteins are indeed nematocyst-specific or, at the least, significantly enriched. Because this is unknown, many of the interpretations presented downstream are not well supported.
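
    The overlap test suggested above can be sketched as follows. This is only a minimal illustration, assuming that the best BLAST hits in each direction have already been parsed into dictionaries; the gene identifiers, the function names, and the choice of a one-sided hypergeometric test are all assumptions made for the example and are not taken from the manuscript or from the Xenia atlas.

```python
from math import comb


def reciprocal_best_hits(a_to_b, b_to_a):
    """Keep only pairs (a, b) where a's best hit is b and b's best hit is a."""
    return {a: b for a, b in a_to_b.items() if b_to_a.get(b) == a}


def hypergeom_upper_tail(k, N, K, n):
    """P(X >= k) when n genes are drawn from N, of which K are marked."""
    upper = min(K, n)
    favourable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favourable / comb(N, n)


def overlap_test(rbh, nematocyst_ids, cnidocyte_markers):
    """Count orthologs that are both in the nematocyst proteome and in the
    cnidocyte cluster, and return that count with its one-sided p-value.
    The universe is restricted to genes that have a reciprocal best hit."""
    in_nem = {a for a in rbh if a in nematocyst_ids}
    in_cnido = {a for a, b in rbh.items() if b in cnidocyte_markers}
    k = len(in_nem & in_cnido)
    return k, hypergeom_upper_tail(k, len(rbh), len(in_cnido), len(in_nem))


# Hypothetical identifiers, for illustration only.
esi_to_xenia = {"Esi_1": "Xe_10", "Esi_2": "Xe_11", "Esi_3": "Xe_12", "Esi_4": "Xe_10"}
xenia_to_esi = {"Xe_10": "Esi_1", "Xe_11": "Esi_2", "Xe_12": "Esi_9"}

rbh = reciprocal_best_hits(esi_to_xenia, xenia_to_esi)
shared, p_value = overlap_test(rbh, {"Esi_1"}, {"Xe_10"})
```

    With real data the universe of orthologs would contain thousands of genes, and the choice of background (all orthologs versus all expressed genes) would affect the p-value, so it should be stated explicitly.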

    Another major issue with the manuscript is the section referring to SCRiPs. First, the authors do not cite Jouiaei, Sunagar et al. (2015), which was the first publication to functionally characterize SCRiPs as toxins. Additionally, the majority of SCRiPs identified in this study and those found in Eunicella have a different cysteine framework. The authors acknowledge this on line 245 but claim that, given that the AlphaFold structure is similar, they are from the same gene family. I think this is very weak support, as sharing a conserved cysteine framework is typically the bare minimum to categorize toxins in a gene family. Although some cysteine frameworks are somewhat hard to resolve because the spacing between the cysteines can be variable, in this case SCRiPs have a very distinct triple repeat of cysteines near the C terminus that is missing in these octocoral SCRiPs. This makes me doubt that these are indeed from the same gene family. Relying on AlphaFold to predict the structure and claiming it is similar to Tau-AnmTx Ueq 12-1 from Urticina eques is also fairly weak support. Although I am not an expert in protein structures, I cannot tell from the images comparing the two structures in Supplementary Figure S1 that they are similar. Perhaps you could align or overlay them, or give some quantitative readout of the similarity of these structures. Currently, I am skeptical of all the SCRiPs described in this manuscript. Additionally, if the authors can show that these are indeed SCRiPs, I would again strongly advise them to check the Xenia scRNA-seq data to see whether the Xenia SCRiP-like sequences are expressed in cnidocytes.
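
    The cysteine-framework comparison discussed above can be made explicit with a few lines of code. The sketch below is illustrative only: the two sequences are invented toy strings, not real SCRiP or octocoral peptides, and the function names are hypothetical. It reports the framework in the usual notation (adjacent cysteines written together, separated cysteines joined by a dash), the spacing between consecutive cysteines, and whether a run of three consecutive cysteines is present.

```python
def cysteine_positions(seq):
    """0-based positions of all cysteines in a peptide sequence."""
    return [i for i, aa in enumerate(seq.upper()) if aa == "C"]


def cysteine_framework(seq):
    """Framework string such as 'C-C-CCC': adjacent cysteines are written
    together, cysteines separated by other residues are joined by '-'."""
    pos = cysteine_positions(seq)
    if not pos:
        return ""
    parts = ["C"]
    for prev, cur in zip(pos, pos[1:]):
        parts.append("C" if cur - prev == 1 else "-C")
    return "".join(parts)


def cysteine_spacings(seq):
    """Number of residues between each pair of consecutive cysteines."""
    pos = cysteine_positions(seq)
    return [cur - prev - 1 for prev, cur in zip(pos, pos[1:])]


def has_cysteine_triplet(seq):
    """True if the sequence contains three consecutive cysteines."""
    return "CCC" in seq.upper()


# Invented toy peptides, for illustration only.
toy_with_triplet = "ACGGCAACCCK"
toy_without_triplet = "ACGGCAACGCK"

framework_a = cysteine_framework(toy_with_triplet)
framework_b = cysteine_framework(toy_without_triplet)
```

    Applied to mature peptide sequences (after removal of the signal peptide and propeptide), a comparison of this kind would show directly whether two candidates share a framework or merely a similar cysteine count.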

    Minor:

    The ShK protein, U-GRTX-Esi4, strikes me as similar to the NEP3 gene family identified in Nematostella, which also has three ShK domains (Columbus-Shenkar et al. 2018).

    Interestingly, U-GRTX-Esi20 and 21 were found to be structurally similar to acrorhagin 1a but do not share a conserved cysteine framework (6 cysteines vs 8). One thing the authors should be careful about, and perhaps point out, is that acrorhagins are not necessarily nematocyst-specific: an ortholog of acrorhagin 1a was found to be expressed in neurons in Nematostella (Sachkova et al. 2020). Perhaps an ancestral acrorhagin 1 was present in the last common ancestor of Anthozoa as a neuropeptide that was later recruited to the venom in Actinia.

    Also in general the authors refer to a lot of phylogenetics that I cannot see in the paper. For example, on line 339:

    "Our genomic survey indicates that these two toxins belong to two distinct monophyletic orthogroups within a very large superfamily of cysteine-rich peptides, encoded by ancestrally duplicated paralogous genes with intronless structures, that also include other members in E. singularis, not detected in the NEM-P."

    What genomic survey are you referring to (where are these data)? What do you mean by "belong to two distinct monophyletic orthogroups"?

    Also, there is no visualization of the results when the authors refer to the genomic surveys, especially when referring to intron-exon boundaries. Please indicate which genomes include which sequences and give the intron-exon boundaries for each gene family. I do not understand how the authors resolved Figure 4. How do you know there was a loss and not a gain of exon 2 in the gene encoding U-GRTX-Esi17? Providing the genomic loci for the toxin gene families would help. Something like Figure 5 from Koludarov et al. (2024) might be useful, but ideally including intron-exon boundaries.

    In the methods the authors mention:

    "Whenever needed (i.e., U-GRTX-Esi20 and 21), a fine-scale classification of orthologous sequences was aided by Maximum Likelihood phylogenetic inference analyses, carried out with IQ-Tree [49] with 1000 ultrafast bootstrap replicates based on the best-fitting model of molecular evolution detected by ModelFinder [50]."

    So please include these data as supplementary figures. The authors refer to plenty of analyses that are not included in the paper. This lack of data makes it very hard to follow many of the phylogenetic and genomic insights from this manuscript.

    Significance

    This work can be very useful in extending our knowledge of venom in cnidarians and can help build a better-resolved picture of the evolutionary history of these ecologically essential proteins.

  4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

    Learn more at Review Commons


    Referee #1

    Evidence, reproducibility and clarity

    Summary:

    The manuscript by Modica et al. reports the characterisation of the venom system in the white sea fan Eunicella singularis, an octocoral species common in the north-western Mediterranean Sea. The authors used a proteo-transcriptomic approach followed by extensive bioinformatics analysis. Specifically, they generated a new E. singularis transcriptome and characterised extracts from nematocysts (venom-bearing structures) and the whole body using tandem mass spectrometry. Toxins were identified by HMMER using the Tox-Prot and VenomZone databases as queries, as well as with the ClanTox web server.

    Major comments:

    1. As far as I am aware, venom production by ectodermal gland cells has been reported only in sea anemones (Moran et al, 2011), so it is unclear whether this is also the case in the octocorallian sea fan. Additionally, cnidarian toxin-like proteins might be produced by neurons (Sachkova et al, 2020) or be involved in development (Surm et al 2024). It is therefore probable that, in E. singularis, not all the toxin-like proteins found in the whole-body proteome and missing from the nematocyst proteome are venom components. Additional experiments would be required to localise those proteins to ectodermal gland cells. I suggest mentioning this limitation and referring to such proteins as "toxin-like" or "putative toxins".
    2. In addition to submitting the proteomics data to PRIDE, it would be helpful for readers and reviewers if the authors provided a supplementary Excel file with all the peptides and proteins identified by PEAKS Studio. I could not access the data on PRIDE, as I think they have not yet been assigned a PXD dataset identifier.

    Minor comments:

    1. It would be helpful for readers to split the Results and Discussions into smaller subsections with headings, perhaps according to the identified toxin families. It would be also helpful to provide a summary figure with all the toxins identified and perhaps toxin expression levels. Especially showing cysteine patterns for new toxins would be very useful.
    2. It is unclear why the Toxin annotation pipeline is hidden in the supplementary material. It would be also helpful to show it as a schematic pipeline in the main text.
    3. The identification of proteolytic cleavage sites is not really described. It would also be helpful to mark them in Figure 2.
    4. "Other peptides present in E. singularis nematocysts and displaying protease inhibitory domains, but likely lacking a toxin function (Kazal-type, cystatines, antistasins, and macins)..." - Why do they likely lack a toxin function? What is the rationale behind this statement?
    5. "cell- or tissue-specific differential maturation patterns" - I think the differential maturation needs to be confirmed by additional experiments to exclude a possibility of being an artifact due to low mass spectrometry sensitivity.
    6. "three consecutive ShK domains with peculiar characteristics (Suppl. Fig. 2)" - what are these characteristics?
    7. Fig. S1 legend: "Octocorallia (cyano bar) and Hexacorallia (blue bar)" - the bars look pink and cyan.

    Referee cross-commenting

    I agree with both reviewers that additional validation of the ethanol extraction method would be required to confirm its specificity and efficiency. Since ethanol is widely used for tissue fixation, I would guess that it is improbable that it leads to disruption of other coral cell types in addition to discharging nematocytes. However, to be 100% sure, that would need to be confirmed experimentally. I think the suggestion to use the Xenia single-cell dataset to validate the nematocyst proteome reported in this paper is really worth trying. However, toxin-like genes in cnidarians might be recruited to non-venom cell types (Sachkova et al, 2020; Surm et al 2024); therefore, if a gene is nematocyte-specific in one species, it does not mean it would be the same in another one, especially if they are distantly related. Thus, the best option would be to run some additional experiments in Eunicella singularis, if the tissue is available.

    Significance

    This study reports the venom composition of an octocoral for the first time. These data are very important for understanding the biology and ecology of these animals, as they rely on venom for feeding and deterring predators. This study is a significant advancement for cnidarian venomics, as most of the literature is limited to sea anemone and jellyfish venoms. It will be of interest to a broad audience: the venomics and coral ecology communities, evolutionary biologists and marine scientists. The main strength of this work is that it provides a comprehensive overview of the venom system in a widespread octocoral species with important ecological roles. The main limitation of this study is that the toxicity and biological function of the identified venom components have not been confirmed experimentally. However, the localisation of the proteins to nematocysts is a very strong indication that they are venom components.

    My expertise: cnidarian venom (biochemistry, ecology and evolution).