Recurrent Evolutionary Innovations in Rodent and Primate Schlafen Genes

Abstract

SCHLAFEN proteins are a large family of RNase-related enzymes carrying essential immune and developmental functions. Despite these important roles, Schlafen genes display varying degrees of evolutionary conservation in mammals. While this appears to influence their molecular activities, a detailed understanding of these evolutionary innovations is still lacking. Here, we used in-depth phylogenomic approaches to characterize the evolutionary trajectories and selective forces shaping mammalian Schlafen genes. We traced lineage-specific Schlafen amplifications and found that recent duplicates evolved under distinct selective forces, supporting repeated sub-functionalization cycles. Codon-level natural selection analyses in primates and rodents identified recurrent positive selection over Schlafen protein domains engaged in viral interactions. Combining crystal structures with machine learning predictions, we discovered a novel class of rapidly evolving residues enriched at the contact interface of SCHLAFEN protein dimers. Our results suggest that inter-Schlafen compatibilities are under strong selective pressures and are likely to impact their molecular functions. We posit that cycles of genetic conflicts with pathogens and between paralogs drove Schlafens’ recurrent evolutionary innovations in mammals.

Article activity feed

  1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

    Reply to the reviewers

    The authors do not wish to provide a response at this time

  2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

    Referee #3

    Evidence, reproducibility and clarity

    This study by Mordier and colleagues presents an in-depth analysis to clarify the evolutionary history and processes of the rapidly evolving Schlafen gene family, with a strong focus on primates and rodents.

    The study is of high quality in my opinion, though I do have some minor comments:

    1. Fig 2 and Fig 4B present inferred phylogenetic trees of Schlafens in primates and rodents. These trees appear to be unrooted or rooted on a single species rather than on an outgroup species or gene. I suggest that the authors consider whether an outgroup gene could be included or whether an outgroup-free approach could be used to estimate the position of the root. This is important because using an unrooted tree to make inferences on gene family evolution has important implications: for example, there are no clades in an unrooted tree (Wilkinson et al 2007, Trends Ecol Evol).
    2. Schlafen proteins beyond mammals are referred to as SLFN11. It is not clear why this is the case, because they seem to be co-orthologous to all mammalian Schlafen groups (except SLFNL1) based on supplementary figure S2. In this context, perhaps this figure should form part of the main text?
    3. For BLAST searches, parameters should be included: what cutoffs were applied for similarity searches, etc.? Related to this, on lines 120-121 homology is described as 'significant'. Homology refers to an evolutionary relationship; sequence similarity may be significant or not depending on the search performed, but homology is qualitative and is simply detectable or not.
    4. The first results section describes the results of phylogenetic analyses. However, this section relies heavily on what might better be considered interpretation of these analyses. This is great and should be included, but I suggest that the branching patterns in the trees and the bootstrap values supporting relationships between genes also be reported in the text, to link interpretations to actual results.
    5. Bustos 2009 included viral genes belonging to the family in their analyses and I think it may be pertinent to do so here also to determine if the results are consistent or not.
    6. Was a rate heterogeneity parameter (e.g. gamma rates / +G) considered in the phylogenetic analyses or model testing? It is not reported here, and it is very rare for this not to improve model fit and phylogenetic accuracy.
    7. The authors state that all data are available in public databases, but this is not the case for the results they generated. Making available the various files produced in this study would be good, e.g. alignments, phylogenetic tree files, structures, etc.
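
The model comparison raised in point 6 can be illustrated with a small sketch. It compares a base substitution model with the same model plus a gamma rate heterogeneity parameter (+G) using the Akaike information criterion. The function names, log-likelihoods, and parameter counts below are hypothetical placeholders, not values from the manuscript; in practice they would come from the output of the phylogenetic software.

```python
def aic(log_likelihood: float, n_free_params: int) -> float:
    """Akaike information criterion: AIC = 2k - 2 lnL (lower is better)."""
    return 2.0 * n_free_params - 2.0 * log_likelihood


def delta_aic_gamma(lnl_base: float, k_base: int, lnl_gamma: float) -> float:
    """AIC(base) - AIC(base+G).

    Assumes +G adds exactly one free parameter (the gamma shape, alpha).
    A positive value means the +G model is preferred.
    """
    return aic(lnl_base, k_base) - aic(lnl_gamma, k_base + 1)


if __name__ == "__main__":
    # Hypothetical numbers for illustration only.
    lnl_jtt = -15234.8        # lnL under the base model
    lnl_jtt_g = -15012.3      # lnL under the base model + G
    k_branches = 57           # free parameters of the base model (e.g. branch lengths)
    delta = delta_aic_gamma(lnl_jtt, k_branches, lnl_jtt_g)
    print(f"delta AIC = {delta:.1f}; +G preferred: {delta > 0}")
```

Reporting such a comparison alongside the selected model would show whether rate heterogeneity was tested and why it was or was not retained.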

    Significance

    This study is an important step forward in clarifying our understanding of Schlafen evolution. I think the manuscript will be of interest to a number of research areas, including gene family evolution, because of its focus on an unusually rapidly evolving gene cluster, and to those working on the Schlafen gene family's functional importance in development and immunity. The results may also draw interest from those interested in the confluence of protein structure, function, and evolution. My expertise in the context of this study is in the phylogenetics and evolution of rapidly evolving gene families.

  3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

    Referee #2

    Evidence, reproducibility and clarity

    In the current manuscript, Mordier et al. combine bioinformatic searches, synteny, and phylogenetic analysis to reconstruct the duplicative history of the Schlafen genes in rodents and primates, and then use molecular evolution analyses in combination with structural modeling to make inferences regarding the role of natural selection in the evolution of this gene family. The study represents an update on Bustos et al. (2009), who had already presented evidence that positive Darwinian selection was likely a factor in the diversification of these genes in mammals. In this context, the contribution of this paper is the identification of sites that are candidates to be evolving under natural selection, and the structural exploration of the location of these sites in the proteins. CODEML's strength lies in the detection of signatures of positive selection at the codon level, but it is not that accurate when it comes to pinpointing the actual sites that might be under selection. Hence, without experimental data, these inferences remain speculative. The manuscript is well-written and represents an update on the evolution of this gene family.

    Major Issues

    The rationale for the choice of species included in the analyses is never presented, and some of it is hard to understand. Why the authors exclude the platypus but include non-mammalian lobe-finned vertebrates is not clear. If they are going to discuss the evolution of these genes outside mammals, the authors need to survey a much wider array of genomes. Even within mammals, there is little discussion of why some species were included and others not. I think that focusing the study on rodents and primates is OK, but I also think it would be important to provide a strong justification for the selection of species included in the study and a tree that justifies splitting the focus between rodents and primates.

    In the trees in Figures 2 and 4, several genes considered as orthologs are not in monophyletic groups. This pattern aligns well with the birth-and-death model of gene family evolution and has implications for the molecular evolution analyses. The authors need to address this issue explicitly. I would use topology tests to evaluate whether these deviations from the expected topology are significant. In addition, the relevant tests to report here are M8 vs M7 and M8 vs M8a. The M0 vs M1a comparison does not provide evidence for positive Darwinian selection. If the M8 vs M7 and M8 vs M8a tests are not significant, the inferences about sites evolving with dN/dS>1 are not really valid.
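
As an illustration of the M8 vs M7 and M8 vs M8a comparisons, the sketch below computes likelihood ratio tests from CODEML log-likelihoods. It assumes the commonly used conventions of 2 degrees of freedom for M7 vs M8 and, for M8a vs M8, a 50:50 mixture of a point mass at zero and a chi-square with 1 degree of freedom (so the chi-square p-value is halved). The function names and log-likelihood values are hypothetical and are not taken from the manuscript.

```python
import math


def chi2_sf(stat: float, df: int) -> float:
    """Upper-tail probability of a chi-square distribution (df = 1 or 2 only)."""
    if stat <= 0.0:
        return 1.0
    if df == 1:
        return math.erfc(math.sqrt(stat / 2.0))
    if df == 2:
        return math.exp(-stat / 2.0)
    raise ValueError("this sketch only supports df = 1 or df = 2")


def lrt(lnl_null: float, lnl_alt: float, df: int, boundary: bool = False):
    """Return (2*delta lnL, p-value) for nested models.

    boundary=True applies the 50:50 mixture null (halved p-value), assumed
    here for M8a vs M8, where omega is fixed at the boundary of its range.
    """
    stat = max(0.0, 2.0 * (lnl_alt - lnl_null))
    p = chi2_sf(stat, df)
    if boundary:
        p = 0.5 * p if stat > 0.0 else 1.0
    return stat, p


if __name__ == "__main__":
    # Hypothetical log-likelihoods for one gene alignment.
    lnl_m7, lnl_m8a, lnl_m8 = -12345.67, -12340.01, -12335.12
    print("M7 vs M8 :", lrt(lnl_m7, lnl_m8, df=2))
    print("M8a vs M8:", lrt(lnl_m8a, lnl_m8, df=1, boundary=True))
```

Only when both tests reject the null model would the site-level posterior probabilities under M8 be worth interpreting.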

    CODEML can implement models designed to test patterns of gene family evolution by contrasting pre- and post-duplication branches, which I think would be of value in this family.

    Some analyses are described very succinctly, which would make replication challenging.

    Minor Issues

    Could 2R be responsible for the emergence of SLFN and SLFNL1?

    There are several minor issues the authors should fix in a revised manuscript. In general, because results are presented before the materials and methods, I think it would be easier for readers if some of the methodological information appeared in the results section.

    They need to be consistent in using italics for species names as well as for capitalization.

    In the Alignment and maximum-likelihood phylogenies section, the authors indicate that they used either Muscle or Mafft for the alignments. What was the rationale for picking one alignment over the other for a given gene? In this section, they also indicate that they selected a best-fitting model of substitution using SMS, but then indicate that they used JTT for protein alignments and HKY for nucleotide alignments.

    How did the authors ensure that nucleotide alignments remained in frame?
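
One common way to keep a nucleotide alignment in frame is to align the protein sequences first and then thread each coding sequence onto its aligned protein, codon by codon. The sketch below shows the idea; it is not the authors' procedure, the function name is hypothetical, and it only checks lengths, not that each codon actually translates to the aligned residue.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def thread_codons(aligned_protein: str, cds: str, gap: str = "-") -> str:
    """Map an unaligned CDS onto a gapped protein sequence.

    Each residue consumes one codon; each protein gap becomes three
    nucleotide gaps, so the result stays in frame by construction.
    """
    cds = cds.upper()
    n_residues = sum(1 for aa in aligned_protein if aa != gap)
    # Drop a single trailing stop codon if the CDS carries one.
    if len(cds) == 3 * (n_residues + 1) and cds[-3:] in STOP_CODONS:
        cds = cds[:-3]
    if len(cds) != 3 * n_residues:
        raise ValueError("CDS length does not match the number of residues")
    codons = []
    pos = 0
    for aa in aligned_protein:
        if aa == gap:
            codons.append(gap * 3)
        else:
            codons.append(cds[pos:pos + 3])
            pos += 3
    return "".join(codons)


if __name__ == "__main__":
    # Toy example: one gap column in the protein alignment.
    print(thread_codons("M-K", "ATGAAA"))
```

Stating whether a codon-aware procedure of this kind was used would address the question above.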

    Significance

    I think this is a significant contribution to our understanding of the evolution of the Schlafen gene family. There are two key contributions here: the demonstration that gene conversion is a factor obscuring relationships among genes in this gene family, and the mapping of amino acids inferred to be evolving under positive selection to structurally important residues of the proteins. These residues should be of interest for functional assays that evaluate the functional role of these proteins.

  4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

    Referee #1

    Evidence, reproducibility and clarity

    Mordier et al. used in-depth phylogenomic methods to analyze the evolution of the mammalian Schlafen gene family. They identified a novel orphan Schlafen-related gene that arose in jawed vertebrates, and they assigned orthology between Schlafen cluster paralogs. This will allow for further accurate selection studies. Throughout the entire manuscript, the authors use nomenclature predating structural and biochemical studies. The nomenclature is based purely on sequence similarities, which are sometimes very weak and not convincing, and not on the known function of the protein. In my opinion, this causes confusion and does not help scientists in the field. Especially in Figure 3, I wouldn't call it RNAse E (AlbA); instead, tRNA recognition site, endoribonuclease domain, and SLFN core domain are the correct domain designations. Since SLFN11 is not a GTPase, why do the authors name the domain a GTPase domain? Actually, the SWADL domain comprises a SWAVDL rather than a SWADL sequence motif. Hence, I would name the domain the SWAVDL domain instead of the SWADL domain, a name which is, in my opinion, misleading and was wrongly chosen in initial publications.

    In the SLFN11 structure shown in, e.g., Figure 3, it would be better if the authors illustrated the important residues of the known RNase active site and ssDNA binding site. Further, a close-up of the SLFN11 interface, with the amino acids involved in the interaction labeled and the residues undergoing positive selection highlighted, would help readers understand the evolutionary adaptation.

    According to Metzner et al., the SLFN11 dimer is built from two interfaces (I and II), where Interface I is situated in the C-terminal helicase domain and Interface II in the N-terminal SLFN11 core domain. It would be helpful for the reader if the authors stuck to this already introduced and widely accepted nomenclature in the field.

    In addition to the antiviral function, SLFN11 expression levels have been reported to show a strong positive correlation with the sensitivity of tumor cells to DNA damaging agents (DDAs). Hence, SLFN11 can serve as a biomarker to predict the response to, e.g., platinum-based drugs. It was revealed that SLFN11 exerts its function by direct recruitment to sites of DNA damage and stalled replication forks in response to replication stress induced by DDAs. Could the authors include this different molecular function of SLFN11 in their discussion of SLFN11's evolution and positive selection?

    Even though it seems unclear from the genetic and evolutionary aspect (Figure 4), mouse Slfn8 and Slfn9 complement human cells lacking SLFN11 during the replication stress response and seem to resemble the function of SLFN11 (Alvi et al. 2023). The authors of this study claim that Slfn8/9 genes may share an orthologous function with SLFN11. Could the authors comment on that discrepancy?

    Significance

    In general, the work is well conducted and provides valuable new insights in an important and growing field of research. However, there are some limitations to the study, including the disregard of known protein function (e.g. SLFN11) and the use of a nomenclature based purely on sequence similarity.