In haploid budding yeast, evolutionary adaptation to constitutive DNA replication stress alters three genome maintenance modules: DNA replication, the DNA damage checkpoint, and sister chromatid cohesion. We asked how these trajectories depend on genomic features by comparing the adaptation in three strains: haploids, diploids, and recombination deficient haploids. In all three, adaptation happens within 1000 generations at rates that are correlated with the initial fitness defect of the ancestors. Mutations in individual genes are selected at different frequencies in populations with different genomic features, but the benefits these mutations confer are similar in the three strains, and combinations of these mutations reproduce the fitness gains of evolved populations. Despite the differences in the selected mutations, adaptation targets the same three functional modules in strains with different genomic features, revealing a common evolutionary response to constitutive DNA replication stress.

    We thank Review Commons and its three reviewers. Reviewers 2 and 3 provide detailed comments, which we address individually. Reviewer 1, however, gives a general critique of how we have approached asking how genome architecture affects the extent of evolution and the details of evolutionary trajectories. Our interpretation of their comments is that our approach and the one that they advocate represent two philosophically different, but complementary, views about how to study evolution in the laboratory. We begin by discussing this difference and then proceed to a point by point response to the three reviews.

    __Reviewer 1__

    Philosophical differences with Reviewer 1

    We interpret Reviewer 1’s comments as endorsing a formal, quantitative study of evolution that aims to explain the factors that control the rate at which fitness increases during experimental evolution. This approach derives from classical population genetics and aims to use a mixture of theory and experiment to uncover general principles that would allow rates of evolution and evolutionary trajectories, expressed as population fitness over time, to be predicted from quantitative parameters, such as population sizes, mutation rates, distributions of the fitness effects of mutations (including their degree of dominance in diploids), and global descriptions of either general (e.g. diminishing returns) or allele-specific epistasis.

    This approach aims to predict how the average fitness trajectory should be affected by variations in these parameters and to describe the variation, at the level of fitness, in the outcomes of a set of parallel experiments. This is an important approach and we have previously used it to investigate how the strength of selection influences the advantage of mutators (Thompson, Desai, & Murray, 2006) and to produce and test theory that predicts how mutation rate and population size control the rate of evolution (Desai, Fisher, & Murray, 2007). Like every approach to evolution, this one has limitations: 1) if it doesn’t identify mutations or investigate phenotypes other than fitness, it cannot reveal the biological and biochemical basis of adaptation or report on how variations in population genetic parameters (population size, haploids versus diploids, etc.) influence which genes acquire adaptive mutations, and 2) if the details of experiments (e.g. whether populations are clonal or contain standing variation, or which phenotypes are being selected for) have strong effects on the population genetic parameters, these must be measured before theoretical or empirical relationships can be used to predict the mean and variance of fitness trajectories produced by a given selection. A variety of evidence suggests that the second limitation is real. Examples include the absence of a universal finding that diploid populations evolve more slowly than haploids (discussed on Lines 437-442), even within the same experimental organism, and the finding that diminishing returns epistasis applies well to domesticated yeast evolving in a variety of laboratory environments (e.g. papers from the Desai lab, starting with Kryazhimskiy, Rice, Jerison, & Desai, 2014) but not to the evolutionary repair experiments that we have conducted (Fumasoni & Murray, 2020; Hsieh, Makrantoni, Robertson, Marston, & Murray, 2020; Laan, Koschwanez, & Murray, 2015).

    The second approach to experimental evolution, which we, as molecular geneticists and cell biologists, predominantly take, is to follow the molecular and cell biological details of how organisms adapt to selective pressure. We subject organisms to defined selective forces, identify candidate causative mutations, test them by reconstructing the evolved mutations, individually and in combination, and perform additional experiments to ask how these mutations are increasing fitness. Because these experiments are performed on model organisms and often address phenotypes that have been studied by classical and molecular genetics, we can often say a good deal about the cell biological and biochemical mechanisms that increase fitness and this work can complement and extend what we know from classical and molecular genetics.

    The current manuscript and its predecessor are examples of finding causative mutations and asking how they improve fitness, with the first paper (Fumasoni & Murray, 2020) demonstrating how mutations in three functional modules could overcome most of the fitness cost of removing an important but non-essential protein and the current paper asking how alterations in genome architecture and dynamics (diploidy and eliminating double-strand break-dependent recombination) affect the extent to which populations increase in fitness and which genes and functional modules acquire mutations as they do so.

    By definition, such experiments are anecdotal: they report on how particular genotypes and genome architectures respond to particular selection pressures. Any individual set of experiments can produce conclusions about the effects of variables, such as population size, mutation rate, and genome architecture, on the mutations that increased fitness in response to the specific selection, but it can do no more than lead to speculation and inference about what would happen in other experiments: speculation from the results of a single project and inferences from the combined results of multiple projects. Our interpretation is that the evolutionary repair experiments that we have performed, which have perturbed budding, DNA replication, and the linkage between sister chromatids, do indeed lead to a common set of inferences: most of the selected mutations reduce or eliminate the function of genes, the interactions between the selected mutations are primarily additive, and the mutations cluster in a few functional modules.

    We believe that the population and molecular genetic approaches to experimental evolution are complementary and that a full understanding of evolution will require combining both of them. We think this will be especially true as we try to use the findings from laboratory studies to improve our understanding of evolution outside the lab, which takes place over longer periods, in more temporally and spatially variable environments, and is subject to variation in multiple population genetic and biological parameters.

    Reviewer #1 (Evidence, reproducibility and clarity (Required)):

    In their previous work the authors examined adaptation in response to replication stress in haploid yeast, via experimental evolution of batch cultures followed by sequencing. Here they extend this approach to include diploid and recombination-deficient strains to explore the role of genome architecture in evolution under replication stress. On the whole, a common set of functional modules are found to evolve under all genetic architectures. The authors discuss the molecular details of adaptation and use their findings to speculate on the determinants of adaptation rate.

    **SECTION A - Evidence, reproducibility and clarity**

    Experimental evolution can reveal adaptive pathways, but there are some challenges when applying this approach to compare genetic backgrounds or environments. The key challenge is that adaptation potentially depends on both the rate of mutation and the nature of selection. Distinct adaptation patterns between groups could therefore reflect differential mutation, selection, or both. The authors allude to this dichotomy but have very limited data to address it. The closest effort is engineering putatively-adaptive variants into all genetic backgrounds including those where they did not arise; the fact that such variants remain beneficial suggests they did not arise in certain backgrounds because of a lower mutation rate, but this is a difficult issue to tackle quantitatively.

    We agree, wholeheartedly, that adaptation depends on the combination of mutation rates and the nature of selection. Our goal was to ask how the molecular nature of adaptation depends on genome architecture when three different architectures are subjected to the same selection: constitutive replication stress caused by removing an important component of the replisome. We used a haploid strain as a baseline and compared it to two other strains chosen to influence either the effect of mutations (a diploid, where fully recessive mutations that were beneficial in the haploid would become neutral) or the rate of mutations (a recombination-defective strain that would be unable to use ectopic recombination to amplify segments of the genome). In both cases, we expected to see effects that are closer to qualitative than quantitative: the absence of fully recessive mutations in evolved diploids and the absence of segmental amplification in the recombination-deficient haploid. We see both effects and they then allow us to ask two other questions: 1) does influencing the effect of a class of mutation (diploids) or preventing a class of mutation (recombination defect) have a major effect on the rate of evolution, and 2) do these differences affect which modules adaptive mutations occur in. As far as we can tell, the answer is no to both questions. We use “as far as we can tell” because our experiments do have limitations. First, the recombination-defective strain has a higher point mutation rate, making it impossible to tell how much this elevation, rather than any other factor, accounts for it showing a greater fitness increase than the recombination-proficient haploid. Unfortunately, to our knowledge, it’s impossible to abolish recombination without affecting mutation rates. Second, we only experimentally tested a subset of the inferred causative mutations, meaning that for many genes, our assertion that they are adaptive is a statistical inference and their assignment to a particular functional module is based on prior literature rather than our own experiments. In response to this criticism, we have now rephrased some of our sentences (see below).

    From mutation accumulation experiments, where the influence of selection is minimized, there is evidence that genetic architecture affects the rate and spectrum of spontaneous mutations. In this experiment, the allele used to eliminate recombination, rad52, will also increase the mutation rate generally. The diploid strain is also likely to have a distinct mutational profile--as a null expectation diploids should have twice the mutation rate of haploids. Recent evidence indicates the mutation rate difference between haploid and diploid yeast might be less than two-fold, but that there are additional differences in the mutation spectrum, including rates of structural change. The context for this study is therefore three genetic architectures likely to differ in multiple dimensions of their mutation profiles, but mutation rates are not measured directly.

    The reviewer is correct that we did not explicitly measure mutation rates, although the frequency of synonymous mutations (Figure 3-S1B) is a proxy for the point mutation rate as long as the majority of these mutations are assumed to be neutral. By this measure, the mutation rates for ctf4∆ haploids and ctf4∆/ctf4∆ diploids, expressed per haploid genome, are close to each other (1.94 for haploids and 1.37 for diploids) but different enough to return p = 0.044 by Welch’s test, whereas the mutation rate for the recombination-deficient, ctf4∆ rad52∆ haploid is 4 to 5-fold higher (7.03). In contrast, we can infer that the *ctf4∆ rad52∆* strain has much lower rates of segmental aneuploidy produced by recombination: we see only one such event in this strain in contrast to 16 in the ctf4∆ haploid and 44 in the ctf4∆/ctf4∆ diploid (Supplementary table 4), even though the amplification of the cohesin loader gene, SCC2, confers similar benefits in all three strains.
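
    The comparison above can be sketched numerically. The block below is a minimal illustration, not the analysis used for Figure 3-S1B: the three means are the values quoted in this response, whereas the per-clone counts passed to `welch_t` are invented placeholders (the real per-clone data are not reproduced here), and the function and variable names are our own.

```python
from math import sqrt
from statistics import mean, variance

# Mean synonymous mutations per haploid genome, as quoted in this response.
REPORTED_MEANS = {
    "ctf4_haploid": 1.94,
    "ctf4_diploid": 1.37,
    "ctf4_rad52_haploid": 7.03,
}


def fold_change(numerator, denominator):
    """Ratio of two mean mutation frequencies."""
    return numerator / denominator


def welch_t(sample_a, sample_b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom.

    Unlike Student's t-test, the two samples are not assumed to share
    a common variance.
    """
    n_a, n_b = len(sample_a), len(sample_b)
    se2_a = variance(sample_a) / n_a  # squared standard error of mean A
    se2_b = variance(sample_b) / n_b  # squared standard error of mean B
    t_stat = (mean(sample_a) - mean(sample_b)) / sqrt(se2_a + se2_b)
    dof = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1)
    )
    return t_stat, dof


rad52_vs_haploid = fold_change(
    REPORTED_MEANS["ctf4_rad52_haploid"], REPORTED_MEANS["ctf4_haploid"]
)
rad52_vs_diploid = fold_change(
    REPORTED_MEANS["ctf4_rad52_haploid"], REPORTED_MEANS["ctf4_diploid"]
)

# HYPOTHETICAL per-clone counts, for illustration only (not our data).
hypothetical_haploid = [2, 1, 3, 2, 2, 1, 3, 2]
hypothetical_diploid = [1, 2, 1, 1, 2, 1, 2, 1]
t_stat, dof = welch_t(hypothetical_haploid, hypothetical_diploid)

print(f"rad52 vs haploid: {rad52_vs_haploid:.2f}-fold")
print(f"rad52 vs diploid: {rad52_vs_diploid:.2f}-fold")
print(f"Welch t = {t_stat:.2f} with about {dof:.1f} degrees of freedom")
```

    Converting the t statistic to a p-value requires the t distribution, which the standard library does not provide; a statistics package would be needed for that step. The ratios depend only on the quoted means, so they can be checked directly against the fold difference stated above.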

    The nature of selection on haploids and diploids is expected to differ because of dominance, but ploidy-specific selection is also possible. The authors discuss how recessive beneficial alleles may be less available to diploids, though this can be offset by relatively rapid loss of heterozygosity. However, diploids should also incur more mutations, all else being equal. The rate of beneficial mutation, as opposed to the rate of mutation generally, will depend on the mutational "target size" of fitness, and the authors' findings recapitulate other literature (particularly regarding "compensatory" adaptation) that points to faster adaptation in genotypes with lower starting fitness.

    We agree with the reviewer and tried to make the point that which mutations are fixed is primarily determined by the product of the rate at which they occur and the benefit which they confer (lines 193-196). Evidence in budding yeast suggests that in diploid cells, removing one copy of most genes fails to produce a measurable fitness benefit (Deutschbauer et al., 2005), suggesting that losing one copy of many genes is purely recessive. If this were always the case, it would be very hard for such heterozygous, loss-of-function mutations to contribute to evolution in diploids: a mutation that inactivates one copy of a gene would have to rise by genetic drift to a frequency high enough that homozygosis of this mutation by mitotic recombination would have a significant probability. Instead, we find that heterozygous mutations in some genes (inactivation of RAD9, what are likely to be hypomorphic mutations in SLD5) but not others (inactivation of IXR1) confer benefits in diploids that allow their frequency to rise much more rapidly by selection than they would by drift, allowing them to reach frequencies at which mitotic recombination becomes probable.
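
    The "rate times benefit" argument can be made concrete with a textbook approximation. The sketch below is not an analysis from the manuscript: it uses Kimura's diffusion formula for the fixation probability of a new mutation in a haploid Wright-Fisher population, ignores clonal interference, and treats the heterozygous effect of a mutation in a diploid as a single selection coefficient `s`. The population size, mutation rates, and selection coefficients are all hypothetical.

```python
from math import expm1


def fixation_probability(s, n):
    """Kimura's diffusion approximation for a single new mutation.

    s: selection coefficient (s >= 0 assumed here); n: population size.
    For s == 0 the mutation is neutral and fixes with probability 1/n.
    """
    if s == 0:
        return 1.0 / n
    # (1 - exp(-2s)) / (1 - exp(-2ns)), written with expm1 for accuracy.
    return expm1(-2.0 * s) / expm1(-2.0 * n * s)


def establishment_rate(mu, s, n):
    """Expected successful mutations per generation: supply times fate."""
    return n * mu * fixation_probability(s, n)


N = 10**6  # HYPOTHETICAL population size

# HYPOTHETICAL mutation classes: (mutation rate per generation, benefit).
frequent_weak = establishment_rate(mu=1e-6, s=0.01, n=N)
rare_strong = establishment_rate(mu=1e-8, s=0.10, n=N)

# A fully recessive heterozygous mutation behaves as neutral (s = 0),
# whereas a semi-dominant one is visible to selection straight away.
recessive_het = fixation_probability(0.0, N)
semidominant_het = fixation_probability(0.02, N)

print(f"frequent, weak class: {frequent_weak:.4g} per generation")
print(f"rare, strong class:   {rare_strong:.4g} per generation")
print(f"semi-dominant / recessive: {semidominant_het / recessive_het:.3g}")
```

    Under these assumed numbers a tenfold larger benefit does not compensate for a hundredfold lower mutation rate, which is the sense in which the product of the two quantities, rather than either alone, decides which mutations are recovered. The last line illustrates why a heterozygous mutation with even a modest benefit is far more likely to reach an appreciable frequency than one that must rely on drift.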

    There is ample literature on the above topics, particularly discussions of the evolutionary advantages of haploidy versus diploidy. While adaptation to replication stress provides a novel starting point for this investigation, much of the manuscript is devoted to long-standing questions that are not specific to replication stress. Unfortunately, the data the authors collected is not sufficient to shed light on these questions, because mutation and selection cannot be effectively distinguished. The Discussion states that "We find that the genes that acquire adaptive mutations, the frequency at which they are mutated, and the frequency at which these mutations are selected all differ between architectures but that mutations that confer strong benefits always lie in the same three modules" (line 379), but it is not clear that these statements are all supported by the data.

    The reviewer makes two points: we fail to make a significant contribution to long-standing questions about the evolutionary genetics of adaptation and we make statements that are not supported by our data. On the first we disagree: unlike much of the previous work, which compares the effects of mutation rates and population sizes on the rates of evolution, we sequence genomes, identify putative causative mutations, verify that they increase fitness, and test, by reconstruction, how their contribution to fitness is affected by fully characterized genome architectures. We know of no comparable work and we believe that this is a useful contribution to understanding evolution. In addition, some of the literature, for example the discussion of haploidy versus diploidy, has failed to reach a universal conclusion. On the second point, we realized that the statement that the reviewer quotes is stronger than it should be since we do not show “that mutations that confer strong benefits always lie in the same three modules”. What we do show is that mutations in all three modules are found in all three genome architectures (Figure 5), and that combining one mutation from each module (using mutations in genes that are found in that architecture) can reproduce the observed fitness increase in each architecture (Figure 6B), but the reviewer is correct that we have not demonstrated that every clone from every population has an adaptive mutation in all three modules. We have therefore modified the quoted sentence as follows (altered wording underlined):

    "We find that the genes that acquire adaptive mutations, the frequency at which they are mutated, and the frequency at which these mutations are selected all differ between architectures but that mutations conferring strong benefits can occur in all three modules in each architecture" (Lines 405-408)

    Focusing on the more novel aspect of their experiment-the presence of replication stress-would arguably be a better approach. On this topic the authors have some interesting observations and speculation, but clear predictions are lacking. The introduction section could be redesigned to explicitly state why genome architecture might affect adaptation in response to replication stress in particular, rather than (or in addition to) adaptation generally. If there were no differences in mutation, does the nature of Ctf4 lead to predictions that the molecular basis of compensatory adaptation should differ among genome architectures? Without such predictions it will be difficult for readers to know whether the observation that different genome architectures follow similar adaptive paths is surprising or not.

    We believe that following this suggestion would diminish the paper. We set out to ask how genome architecture affected adaptation to the strong fitness defect produced by removing an important component of an essential process, DNA replication. We chose replication stress as an example of cell biological damage that cells would have to repair with the hope that the results would give general clues about evolutionary repair, rather than hoping that the experiment would inform us about how replication stress altered the types of mutation (e.g. point mutations versus segmental amplification) that were selected. As we point out at the beginning of our response, we recognize that the result of any one such experiment must be anecdotal and any attempt to generalize must be described as speculation if it refers only to this one experiment, or inference if it refers to this experiment and other published work. In those cases where we discuss the effect of genome architecture on evolutionary trajectories, we can draw conclusions that apply to our own experiments, but can only speculate on adaptation to different selections. In others, where we see commonalities between our experiments and previous work on evolutionary repair (cite Review), we can make inferences about evolution to adapt to removing important proteins and speculate about other forms of selection. We have revised the discussion to make it clear where we conclude, where we speculate, and where we infer. We suspect that our finding that genome architecture has a larger effect on which genes acquire adaptive mutations than it does on which modules these mutations alter will generalize to other evolutionary repair experiments and may be true even more broadly.

    We deliberately did not make predictions about the effect of genome architecture on the rate at which population fitness increased or the mechanism of adaptation to replication stress because we believed that our ignorance and the diverging results of previous experiments were sufficient to make both exercises worthless. After the fact, we interpret our results to suggest that mutations that reduce the activity of components, such as Sld5, that are stably associated with replication forks should be semi-dominant, but we were not nearly smart enough to make such a specific prediction before the experiment began!

    **Minor comments:**

    Shifts in ploidy from diploid to haploid are less common than the reverse change, so the observation of such a shift (Fig. 1) should be discussed in more detail.

    We now mention that haploids becoming diploids is more common than the reverse transformation and point out that genome sequencing reveals that these strains are true haploids rather than aneuploids.

    “One diploid population (EVO14) gave rise to a population with a haploid genome content, suggesting a possible haploidization event during evolution. Sequencing revealed no aneuploidies as a potential explanation of this phenomenon. While diploidization has been recurrently observed during experimental evolution with budding yeast (Aleeza C. Gerstein & Otto, 2011; Aleeza C Gerstein, Chun, Grant, & Otto, 2006; Harari, Ram, Rappoport, Hadany, & Kupiec, 2018; Venkataram et al., 2016), reports of spontaneous haploidization events have been instead scarce. Given the difficulties introduced by the change of ploidy over the 1000 generations, we have excluded EVO14 from all our analyses.” (Lines 122-128)

    We believe that the most likely mechanism is that the strain sporulated to produce haploids that were fitter than their diploid parent, but because this event occurred in only one out of eight populations and the proposed explanation is pure speculation, we have not included it in the revised manuscript.

    Line 88 typo 'stains'.

    Fixed. Thank you.

    Reviewer #1 (Significance (Required)):

    **SECTION B - Significance**

    The novel aspect of this study is the combination of replication stress and genome architecture, but here the significance is limited by a lack of clear predictions on how these factors might interact. On the other hand, much of the manuscript is devoted to why adaptation might vary among genome architectures in general, but this long-standing and important question is not particularly well resolved by this experimental approach, which can't disentangle mutation and selection.

    Our belief is that quantitatively predicting how selection will change fitness is nearly impossible because we lack the detailed knowledge of population genetic parameters that apply to our experiments. Prediction is even harder if the goal is to identify which genes will fix adaptive mutations and understand how these mutations alter cellular phenotypes to increase fitness. Thus our approach is almost entirely empirical: we do experiments that alter interesting variables, collect data, and do our best to interpret them and suggest how the conclusions of individual experiments might generalize.

    The authors highlight the dichotomy when discussing the evolution of ploidy: "We suggest that... genome architecture affects two aspects of the mutations that produce adaptation: the frequency at which they occur and the selective advantage they confer" (line 399), but presenting this as a novel inference does not appropriately acknowledge prior research and discussion of these ideas; several relevant papers are cited by the authors in other contexts. It may be possible to recast these findings as a test of the role of genome architecture in adaptation generally, but the authors should clarify the limitations of experimental evolution and more fully consider the theory and data outlined in previous research. In particular, few studies can claim to directly compare mutation rates between genome architectures, and it is not obvious that the present study is an example of such.

    We have the disadvantage that the reviewer doesn’t identify the literature we fail to cite. To us, the argument the reviewer quotes is self-evident. As we mention above, our goal was not to test either general or detailed predictions, and the level at which we analyzed our experiment, especially demonstrating that mutations were causal and reconstructing them individually and in combination, is missing from previous work. Finally, measuring mutation rates is supremely difficult: you either need good ways of following all possible forms of mutation, quantitatively and without selection, or you resort to selecting mutations with a particular phenotype and molecularly characterizing them, knowing that these assays may well give different ratios of the rates of different types of mutation at different loci. We do make and report one measure of mutation rate, the rate of synonymous mutation in protein coding genes, which we discuss above.

    Reviewer expertise: Evolutionary genetics; experimental evolution; mutation.

    Reviewer #2 (Evidence, reproducibility and clarity (Required)):


    This manuscript investigates the effect of an organism's genotype (or, as the authors call it, an organism's 'genome architecture') on evolutionary trajectories. For this, the authors use Saccharomyces cerevisiae strains that experience some form of replication stress due to specific gene deletions, and that further differ in ploidy and/or the type of gene(s) deleted. They find the same three functional modules (DNA replication, DNA damage checkpoint, sister chromatid cohesion) are affected across the 3 different genotypes tested, although the specific genes that are mutated vary.

    **Major comments**

    This is a solid and exceptionally eloquent paper, comprising a large body of work that is in general well presented. That said, I do have some suggestions and questions. At several points in the manuscript, the authors should perhaps be more careful in their wording and avoid overgeneralizing data without providing additional evidence for these claims.

    We thank the reviewer for their constructive review and address their request for more careful wording below.

    • Some key points of the study are not entirely clear to me; possibly because the study builds upon a previous study that was recently published in eLife. Anyhow, I think it would be useful to clarify the following points a bit more:

    • Why exactly was ctf4∆ chosen as a model for replication stress? What is the evidence that ctf4∆ is a good model for replication stress? Without including some evidence for this, it is unclear how well the findings in this study really can be generalized to replication stress (which is what the authors do now).

    We described the reasons for choosing CTF4 deletion to mimic DNA replication stress in our previous eLife paper, to which we refer at. Nevertheless, the reviewer is right in asking us not to assume that the reader will have read our previous work. Briefly: DNA replication stress is a term that is loosely defined as the combination of the defects in DNA metabolism and the cellular response to these defects in cells whose replication has been substantially perturbed (Macheret & Halazonetis, 2015). Established methods in the field to induce DNA replication stress consist of either pharmacological treatments or genetic perturbation. Pharmacological treatments include hydroxyurea, which targets ribonucleotide reductase and hence stalls forks as a result of dNTP depletion (Crabbé et al., 2010), and aphidicolin, which directly inhibits polymerases α, ε and δ (Vesela, Chroma, Turi, & Mistrik, 2017b; Wilhelm et al., 2019). For genetic perturbation, the conditional depletion of replicative polymerases (Zheng, Zhang, Wu, Mieczkowski, & Petes, 2016) is frequently used. These methods are incompatible with experimental evolution, as cells can mutate the targets of replication inhibitors or alter the expression of genes that have been reduced in expression or activity. Removing an important but non-essential component of the replication machinery avoids these problems. We chose CTF4 deletion as a manipulation that affected the coordination of events at the replication fork: in the absence of Ctf4, the polα-primase complex is no longer physically bound to the replicative helicase, and thus the polymerase’s abundance at the replisome decreases (Tanaka et al., 2009). This manipulation achieves the same effects as polymerase depletion and replisome stalling, producing a constitutive DNA replication stress that can only be overcome by mutations in other genes. Multiple studies have shown that ctf4∆ cells display replication intermediates commonly associated with DNA replication stress, such as the accumulation of ssDNA gaps and reversed forks (Abe et al., 2018; Fumasoni, Zwicky, Vanoli, Lopes, & Branzei, 2015), fork stalling (Fumasoni & Murray, 2020), checkpoint activation (Poli et al., 2012; Tanaka et al., 2009) and altered chromosome metabolism (Kouprina et al., 1992).

    We now justify our choice of deleting CTF4 at line 74:

    “DNA replication stress is often induced with drugs or by reducing the level of DNA polymerases (Crabbé et al., 2010; Vesela, Chroma, Turi, & Mistrik, 2017a; Wilhelm et al., 2019; Zheng et al., 2016). To avoid evolving drug resistance or increased polymerase expression, which would rapidly overcome DNA replication stress, we deleted the CTF4 gene, which encodes a non-essential subunit of the DNA replication machinery (the replisome) (Kouprina NYu, Pashina, Nikolaishwili, Tsouladze, & Larionov, 1988). Ctf4 is a homo-trimer that functions as a structural hub within the replisome (Villa et al., 2016; Yuan et al., 2019) by binding to the replicative DNA helicase, primase (the enzyme that makes the RNA primers that initiate DNA replication), and other accessory factors (Gambus et al., 2009; Samora et al., 2016; Simon et al., 2014; Villa et al., 2016). In the absence of Ctf4, the Polα-primase and other lagging strand processing factors are poorly recruited to the replisome (Samora et al., 2016; Tanaka et al., 2009; Villa et al., 2016), causing several characteristic features of DNA replication stress, such as accumulation of single strand DNA (ssDNA) gaps (Abe et al., 2018; Fumasoni et al., 2015), reversed and stalled forks (Fumasoni & Murray, 2020; Fumasoni et al., 2015), cell cycle checkpoint activation (Poli et al., 2012; Tanaka et al., 2009) and altered chromosome metabolism (Hanna, Kroll, Lundblad, & Spencer, 2001; Kouprina et al., 1992). As a consequence of these defects, ctf4∆ cells have substantially reduced reproductive fitness (Fumasoni & Murray, 2020).”

    Would the authors expect to see similar routes of adaptation if a 'genomic architecture' with a less severe/other replication defect would have been used? I realize the last question is perhaps difficult to address without actually doing the experiment (which I am not suggesting the authors should do); I just want to point out that perhaps some data should not be over-generalized.

    We share the reviewer’s interest in asking whether different forms of DNA replication stress would lead to the same results, and we plan to rigorously investigate this question in a separate paper. We note that a careful comparison between different forms of DNA replication stress has never been made and that authors studying this phenomenon often rely on a single perturbation to induce DNA replication stress (Crabbé et al., 2010; Wilhelm et al., 2019; Zheng et al., 2016). We agree that such a comparison will be useful, but we believe (as indicated by the reviewer) that it will require an amount of work that goes beyond the scope of our study. To avoid over-generalization, we are now using “a form of DNA replication stress” in lines 33, 244, 401, 414 and 461, to make it clear that our conclusions (as opposed to inferences and speculations) are restricted to the response to a single example of replication stress.

    Likewise, why was RAD52 selected as the gene to delete to affect homologous recombination? I understand that it is a key gene, but on the flipside, absence of RAD52 affects multiple cellular pathways and (as the authors also observe in their populations) also results in increased mutation rates which might confound some of the results.

    We aimed to observe the largest deficiency in DNA recombination possible and therefore chose to delete RAD52 because of its many roles in different forms of homologous recombination (Pâques & Haber, 1999). Deleting other genes, such as RAD51, would have inhibited canonical double strand break (DSB) repair, but allowed other mechanisms that can rescue stalled replication forks (Ait Saada, Lambert, & Carr, 2018), such as break induced replication (BIR) or single strand annealing (SSA) (Ira & Haber, 2002).

    Our position regarding the inevitable increase in mutation rates obtained while working with genome maintenance processes is elaborated in our response to reviewer #1 above.

    A sentence describing our choice to delete RAD52 has now been included at line 86:

    “…as well as from haploids impaired in homologous recombination due to the deletion of RAD52 (Figure 1A), which encodes a conserved enzyme required for pairing homologous DNA sequences during recombination (Pâques & Haber, 1999). Because Rad52 is involved in different forms of homologous recombination, its absence produces the most severe recombination defects and thus allows us to achieve the largest recombination defect achievable with a single gene deletion (Symington, 2002).”

    Related to the first comment, it is also unclear to me how well the system chosen by the authors is representative of the replication stress experienced by tumor cells (as briefly touched upon in the final section of the discussion). Are some of the homologs key oncogenes that drive carcinogenesis?

    We should have been clearer. Our goal was to argue that the lesions and responses produced by replication stress in tumor cells, such as stalled replication forks and checkpoint activation, were similar to those seen in yeast cells lacking Ctf4. We did not mean to imply that removing Ctf4 from yeast cells had the same effects on cell proliferation and survival as inactivating tumor suppressors and activating proto-oncogenes have in mammalian cells. Despite the difference between direct (removing Ctf4) and indirect effects on DNA replication (tumor cells), the replication intermediates (ssDNA, stalled and reversed forks), the cell cycle defects (G2/M delay), the genetic instability (increased mutagenesis and chromosome loss) and chromosome dynamics (late replication zones and chromosome bridges) generated by the absence of Ctf4 are similar to those observed in oncogene-induced DNA replication stress in mammalian cells (Kotsantis, Petermann, & Boulton, 2018). We therefore believe our experiments reveal evolutionary responses to a constitutive DNA replication stress that resembles the replication stress seen in cancer cells. Nevertheless, we agree that the comparison with cancer evolution remains speculative and we therefore avoided mentioning cancer in the title of our paper or in our conclusions, and only discuss it in a speculative section of the discussion.

    We have modified this section of the discussion as follows (line 554):

    “While generated through a different mechanism (unrestrained proliferation, rather than replisome perturbation), oncogene induced DNA replication stress produces cellular consequences (Kotsantis et al., 2018) which are remarkably similar to those seen in the absence of Ctf4, such as the accumulation of ssDNA, stalled and reversed forks (Abe et al., 2018; Fumasoni & Murray, 2020; Fumasoni et al., 2015), genetic instability (Fumasoni et al., 2015; Hanna et al., 2001; Kouprina et al., 1992) and DNA damage response activation (Poli et al., 2012; Tanaka et al., 2009). Based on these similarities we speculate that evolutionary adaptation to DNA replication stress could reduce its negative effects on cellular fitness and thus assist tumor evolution.”

    The authors should consider rephrasing some sentences regarding the occurrence of adaptive mutations. Sentences such as 'which genes are mutated depends on the selective advantage' (p1; lines 15-16); 'genome architecture controls the frequency at which mutations occur' (p15), "genome architecture controls which genes are mutated" (p1, line 20) make it sound like the initial occurrence of mutations is not random, whereas in reality, the mutational landscape is the result of the combined effect of occurrence and fitness effect of the mutations, with the latter rather than the former likely being the main driver behind the observed patterns.

    We thank the reviewer for asking for more precision in the above sentences, whose proposed changes we now list:

    *“Mutations in individual genes are selected at different frequencies in different architectures, but the benefits these mutations confer are similar in all three architectures, and combinations of these mutations reproduce the fitness gains of evolved populations.” (Lines 13-15) *

    “Genome architecture influences the distribution of adaptive mutants” (Line 277)

    "genome architecture influences the frequency at which mutations occur, the fitness benefit they confer, and the extent of overall adaptation." (Lines 462-463)

    Some important methodological information is missing or unclear in the manuscript:

    The authors should provide more details on how they decided which clones to select for sequencing. Did they select the biggest colonies; were colonies picked randomly, ...

    The following sentence is now reported in the materials and methods section (Line 603):

    “To capture the within-population genetic variability we selected the clones displaying the largest divergence of phenotypes in terms of resistance to genotoxic agents (methyl-methanesulfonate, hydroxyurea and camptothecin).”

    What is the population size during the evolution experiment?

    We have now added the following sentence at line 599:

    *“In this regime, the effective population size is calculated as N0 x g, where N0 is the size of the population bottleneck at transfer and g is the number of generations achieved during a batch growth cycle, and corresponds to approximately 10^7 cells.”*
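
    As a rough illustration of the N0 x g calculation quoted above, the arithmetic can be sketched in a few lines of Python. The bottleneck size used here is a hypothetical value, since N0 is not stated in this response, and g is taken as the roughly 10 generations per batch cycle discussed elsewhere in this response.

```python
def effective_population_size(bottleneck_size: float, generations_per_cycle: float) -> float:
    """Approximate effective population size under serial batch transfer as N0 * g."""
    return bottleneck_size * generations_per_cycle


# Hypothetical bottleneck size (assumption, not a value reported in the manuscript).
N0 = 1e6
# Approximate number of generations per batch growth cycle.
g = 10

print(f"Ne ~ {effective_population_size(N0, g):.0e} cells")
```

    With these illustrative inputs the product is of the order of ten million cells; a different bottleneck size would scale the estimate proportionally.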

    Sequencing of populations and clones: coverage should be mentioned

    The following sentence has now been added at line 616:

    “Clones and populations were sequenced at approximately the following depths: 25-30X for haploid clones, 50-60X for diploid clones, 50-60X for haploid populations and 120-130X for diploid populations.”

    Identification of mutations (p19, line 573): Is this really how the authors defined whether a variant is a mutation? Based on the definition given here, DNA mutations that lead to a synonymous mutation in the protein are not considered as mutations?

    We apologize for this typo. We do identify and consider synonymous mutations as evidenced by Figure 3-S1B. Now the sentence at line 626 correctly reports:

    “A variant that occurs between the ancestor and an evolved strain is labeled as a mutation if it either (1) causes a substitution in a coding sequence or (2) occurs in a regulatory region, defined as the 500 bp upstream and downstream of the coding sequence.”

    Perhaps the information can be found elsewhere, but the source data excel files for mutations is incomplete and should at the very least contain information on the type of mutation (eg. T->A), as well as the location of this mutation in the respective gene.

    Perhaps the reviewer is referring to Supplementary table 2, where we list the number of times a gene has been mutated in different populations (and which thus summarizes different types of mutations affecting the same gene). The information they request is reported in Supplementary table 1 for all the variants detected in population and clone sequencing.

    **Minor comments**

    • While the author already cite several significant papers relevant for their manuscript, some other studies could also be included:

    We thank the reviewer for highlighting these references, which are now cited at line 28.

    From the text in the abstract, it is unclear what the three genomic architectures (line 13) exactly are, the authors should consider spelling this out.

    In response to the reviewer’s request for clarity, we now propose the following change in line 13:

    *“We asked how these trajectories depend on a population’s genome architecture by comparing the adaptation of haploids to that of diploids and recombination deficient haploids.”* (Lines 9-11)

    Can the authors speculate on why a homozygous ctf4D/ctf4D rad52D/rad52D would be lethal, and a haploid not?

    See below

    The authors note that a diploid ctf4D/ctf4D strain is less fit than its haploid counterpart. Why do the authors think this is the case?

    In response to the two previous questions, we now propose the following speculations that we include in the text (Line 97):

    “Diploid cells require twice as many forks as haploids and Ctf4-deficient diploids are thus more likely to have forks that cause severe cell-cycle delays or cell lethality. We speculate that this increased probability explains the more prominent fitness defect displayed by diploid cells. Interestingly, homologs of Ctf4 are absent in prokaryotes, where the primase is physically linked to the replicative helicase (Lu, Ratnakar, Mohanty, & Bastia, 1996) and Ctf4 is essential in the cells of eukaryotes with larger genomes such as chickens (Abe et al., 2018) and humans (Yoshizawa-Sugata & Masai, 2009). Rad52 is likely involved in rescuing stalled replication forks by recombination-dependent mechanisms (Fumasoni et al., 2015; Yeeles, Poli, Marians, & Pasero, 2013). We speculate that the absence of Rad52 increases the duration of these stalls and leads some of them to become double-stranded breaks, resulting in cell lethality and explaining the decreased fitness of ctf4Δ rad52Δ haploid double mutants. In diploid ctf4Δ rad52Δ cells, which have twice as many chromosomes, the number of irreparably stalled forks may be sufficient to kill most of the cells in a population, thus explaining the inviability of the strain.”

    The authors passage their cells for 100 cycles and assume that this corresponds to around 1000 generations for each population. However, the fitness differences between the different starting strains (see also Figure 1B) are likely to cause considerable differences in number of generations between the different strains. Do the authors have more precise measurements of number of generations per population? If not, perhaps it should be noted that some lineages may have undergone more doublings than others, and perhaps also discuss if and how this could influence the results?

    In a batch culture regime, where populations are allowed to reach saturation after each dilution, the number of generations at each passage is dictated by the dilution factor (Van den Bergh, Swings, Fauvart, & Michiels, 2018). A dilution of 1:1000 from a saturated culture will allow for approximately 10 generations before populations reach a new saturated phase. As long as saturation is allowed to occur, this number is independent of the fitness of the cultured strains: slower-dividing strains will simply take more time to reach saturation after each dilution. At the beginning of the experiment, we had to passage the ctf4Δ rad52Δ strains every 48 hrs instead of every 24 hrs. After generation 50, ctf4Δ rad52Δ strains reached saturation within 24 hrs and were then diluted daily. The total count considers the number of passages a culture has undergone, and not the number of days of culture, and thus should guarantee approximately the same number of generations in all three genome architectures.
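
    The relationship between dilution factor and generations per passage can be made explicit with a short calculation. The sketch below assumes that each passage regrows the culture by exactly the dilution factor (that is, saturation is reached every time); the function names are ours and purely illustrative.

```python
import math


def generations_per_passage(dilution_factor: float) -> float:
    """Doublings needed to regrow a culture diluted by `dilution_factor` back to saturation."""
    return math.log2(dilution_factor)


def total_generations(passages: int, dilution_factor: float) -> float:
    """Cumulative generations after a given number of passages at a fixed dilution."""
    return passages * generations_per_passage(dilution_factor)


# 1:1000 dilution, 100 passages, as described in the exchange above.
print(round(generations_per_passage(1000), 2))
print(round(total_generations(100, 1000)))
```

    Because the result depends only on the dilution factor and the number of passages, the growth rate of a strain changes how long a passage takes, not how many generations it contains.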

    Panel A of figure 1A is somewhat confusing; as this seems to indicate that the ctf4∆ was introduced after strains were made, for example, haploid recombination deficient (which is not how these strains were constructed). Perhaps a better way of representing would be to have the indication of DNA replication stress pictured inside the yeast cells.

    We have modified Figure 1A to better represent the way the strains were constructed. For space reasons we have not represented a perturbed fork within each cell, but rather above all of them.

    Legend to Figure 1: is fitness expressed relative to haploid or diploid WT cells for the diploid strains?

    We apologize for having missed this detail in the figure legends. Throughout the figures, haploid and diploid cells were competed against reference strains with the same ploidy. We now add this sentence in Figure 1 and in the materials and methods (line 686).

    Figure 3: to improve readability of this figure, the authors could consider placing the legend of the different symbols (#, *,..) in the figure as well and not just in the figure legend.

    We now include the symbols legend in Figure 3.

    Figure 5 shows Indels, but if I am correct, these mutations are not discussed in the text; nor is it mentioned what the authors used as a cut-off to determine indels (the authors use the term 'small indels' without defining it)? For example, the data shown in Figure 3 and Figure 4 only includes SNPs and not indels (correct?) - but the indels should also be taken into account when investigating which modules are hit.

    Gapped alignments of the relatively long 150 bp paired-end reads in our data set permit the identification of small indels ranging in size from 1–55 bp using the VarScan pileup2indels tool (Koboldt et al., 2012). All small indels (and the respective sequence affected) are listed together with SNPs in Supplementary table 1. Figure 3A, Figure 4 and Figure 5B are representations of ‘gene mutations’, which include both SNPs and small InDels. Large chromosomal insertions and deletions, not detectable by short read gapped alignment, are instead identified using the VarScan pileup2copynumber tool (Koboldt et al., 2012), and are represented as amplifications or deletions in Figure 3B and 5C.

    The following sentence has been added to the material and methods at line 629:

    “Gapped alignments of the 150 bp paired-end reads in our data set permit the identification of small indels ranging in size from 1–55 bp using the VarScan pileup2indels tool (Koboldt et al., 2012). All small indels (and the respective sequence affected) are listed together with SNPs in Supplementary table 1.”

    The following definition has been added in Figure legends 3A, 4 and 5A and B.

    “Gene mutations (SNPs and small InDels 1-55bp)”

    Figure 5 mentions: # gene mutations. So these are only the mutations in genes, and not in their up- or downstream regulatory regions?

    We use a broader definition of a gene, not restricted to the open reading frame, and including its regulatory regions. The following definition has been added to figure 5’s legend.

    “Frequency of SNPs and small InDels (1-55bp) affecting genes (Open reading frames and associated regulatory regions).”

    Figure 3-S1: labels of C panels are missing.

    Labels are now included in Figure 3-S1

    Figure 3-S1, panel B: why did the authors focus on synonymous mutations?

    Panel B is discussed in line 186 and contrasted with panel A to argue that the increased number of mutations detected in ctf4∆ rad52∆ strains is due to a higher mutation rate (which is expected to increase synonymous mutations) rather than to a higher number of adaptive mutations (which are less likely to be synonymous) being selected.

    Reviewer #2 (Significance (Required)):

    This is a solid and clearly written study, comprising a large body of work that is generally well presented and that will be of interest to scientists active in the field of (experimental) evolution and replication.

    However, many aspects studied in this manuscript have already been studied and reported before; including the recent eLife paper by the same group, as well as studies by other labs that have investigated how genome architecture / genotype affects evolutionary trajectories, the effect of ploidy on evolution, .... Because of this, I do feel that the authors should put their findings more in the context of existing literature context, including a general description of which results are truly novel, which confirm previous findings and which results seem to go against previous reports. This is already so at some points in the text, but I feel this could be done even more.

    We now rephrase the following paragraphs in our discussion to better highlight the main conclusions in contrast to the existing literature:

    “Engineering one mutation in each module into an ancestral strain lacking Ctf4 is enough to produce the evolved fitness increase in all three genomic architectures. Furthermore, engineering mutations in individual genes confers benefits in all three architectures (Fig. 6A), even in those where mutations in these genes were rare, and combining these mutations recapitulated the evolved fitness increase in all three architectures (Fig. 6B). Altogether our results demonstrate the existence of a common pathway for yeast cells to adapt to a form of constitutive DNA replication stress.” (Lines 409-414)

    “Our results thus go against the trend of slower adaptation in diploids as compared to haploids reported by the majority of other studies (A. C. Gerstein, Cleathero, Mandegar, & Otto, 2011; Marad, Buskirk, & Lang, 2018; Zeyl, Vanderford, & Carter, 2003). This effect is not limited to populations experiencing DNA replication stress (Figure 2A) but is also present in control wild-type populations (Figure 2B). Our results support the idea that the details of genotypes, selections, and experimental protocols can determine the effect of ploidy on adaptation.” (Lines 437-442)

    “Our results therefore agree with previous reports observing declining adaptability across strains with different initial fitness but largely fail to observe diminishing return epistasis as a potential justification of this phenomenon. Our experiments and two previous evolutionary repair experiments (Hsieh et al., 2020; Laan et al., 2015) both show interactions that are approximately additive between different selected mutations. The reasons for this difference are currently unknown.” (Lines 450-455)

    Additionally, I think the authors should be more careful not to over-generalize their findings, which come from only a few specific genetic manipulations that might not be representative for general replication stress. For example (p15), can the authors really claim that they have unraveled general principles of adaptation to constitutive DNA replication stress? Perhaps a better motivation of the choice of ctf4 as a model mutation for DNA replication stress could also help (see also my earlier comments). A similar comment applies to the molecular mechanisms affecting adaptation in diploid cells - what evidence do the authors have that their findings are not specific to the one specific type of diploid strain they used in their study? Adding a bit more background information or nuance for some of the claims would help tackle this issue.

    We have now followed the suggestions made previously by the reviewer to justify our experimental choices better and to use language that avoids over-generalization.

    Field of expertise of this reviewer: genetics, evolution, genomics

    Reviewer #3 (Evidence, reproducibility and clarity (Required)):


    Here the authors carry out an evolution experiment, propagating replicate populations of the budding yeast with the CTF 4 gene deleted in three different genetic backgrounds: haploid , diploid and recombination deficient (RAD52 deletion). The authors find that the rate of evolution depends on the initial fitness of the different genetic backgrounds which is consistent with a repeated finding of evolution experiments: that beneficial mutations tend to have a smaller fitness effect in high fitness genetic backgrounds. Curiously even though the targets of selection tended to be specific to each of the three different genetic backgrounds, genetic reconstruction experiments showed beneficial mutations convert a fitness increase in all genetics backgrounds. The authors go on to provide a plausible explanation for why each of the three genetic backgrounds are predisposed to certain types of beneficial mutations. Overall, these results provide important context and caveats for an emerging consensus that genetic background determines the rate of evolution, a comprehensive molecular breakdown of adaptation to DNA replication stress and a mechanistic explanation for why different beneficial mutations are favoured in diploids, haploids and recombination deficient strains. This is a well-executed study that is beautifully presented and easy to follow. This will be of great interest to those in the experimental evolution community and the data an excellent resource.

    We thank reviewer #3 for emphasizing that reconstructed mutations are beneficial even in architectures where they were not ultimately detected at the end of the experiment. We have now highlighted this point in our conclusions in response to the requests of reviewers #1 and #2 for more clarity regarding our novel findings.

    “We find that the genes that acquire adaptive mutations, the frequency at which they are mutated, and the frequency at which these mutations are selected all differ between architectures but that mutations that confer strong benefits can occur in all three modules in each architecture. Engineering one mutation in each module into an ancestral strain lacking Ctf4 is enough to produce the evolved fitness increase in all three genomic architectures. Furthermore, reconstruction of a panel of mutations into all three architectures proved they are adaptive even in architectures where the affected genes were not found significantly mutated by the end of the experiment. Altogether our results demonstrate the existence of a common pathway for yeast cells to adapt to a form of constitutive DNA replication stress.” (Lines 405-414)

    **Major comments:**

    • Are the key conclusions convincing? Yes, the convergent evolution analysis, fitness assays, and genetic reconstructions are sufficient to characterise the genetic causes of adaptation in this experiment, and are of the highest standard. The authors do particularly well to fully recover the fitness increases that evolved with their genetic reconstructions, which imparts a completeness to their understanding of what happened in their evolution experiment.
    • Should the authors qualify some of their claims as preliminary or speculative, or remove them altogether? No, in nearly all cases the authors make reasonable claims. One exception is on L419 in the discussion, where the authors speculate why some mutations do not follow diminishing returns epistasis, but this idea does not really have any basis (no citation or reasons to suggest that DNA repair genes are less connected with other genes in the genome). If the authors cannot support this statement, it should be removed, and instead write that is currently unknown why some individual mutations do not follow the pattern of diminishing returns.

    On reflection, we agree with the reviewer and now state,

    *“Our results confirm previous reports observing declining adaptability across strains with different initial fitness but largely fail to observe diminishing return epistasis as a potential justification of this phenomenon. Our experiments and two previous evolutionary repair experiments (Hsieh et al., 2020; Laan et al., 2015) both show interactions that are approximately additive between different selected mutations. The reasons for this difference are currently unknown.” *

    A hypothesis, which would need experimental validation, is that different mutations have different degrees of epistatic interaction with the rest of the genome. Ixr1, whose mutation follows diminishing returns epistasis, is a transcription factor that could in principle affect the expression of many other genes implicated in different cellular modules. In contrast, Sld5, Scc2 and Rad9, whose mutations have the same effect across different genome architectures, have more mechanistic roles in genome maintenance and may therefore have strong epistatic interactions with only a restricted number of cellular modules involved in DNA metabolism.

    • Would additional experiments be essential to support the claims of the paper? No.
    • Are the data and the methods presented in such a way that they can be reproduced? Yes, but some more details are needed for the convergent evolution analysis, see minor comments.
    • Are the experiments adequately replicated and statistical analysis adequate? Yes, but some more statistic reporting in the main text or figure legends would be helpful, for example. L159: Please report the statistical test, test statistic and p value in the text or in the figure legend. Currently significance is indicated, but the methods do not specify the test.

    We apologize for the lack of clarity in the main text. The test used for all fitness analyses was only reported in the materials and methods, as follows:

    “The P-values reported in figures are the result of t-tests assuming unequal variances (Welch’s test)”

    We now include the test and the associated p-value in line 184, and write the above sentence in all the relevant figures.
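
    For readers less familiar with the unequal-variance t-test named above, the following minimal sketch computes the Welch t statistic and the Welch-Satterthwaite degrees of freedom using only the Python standard library. The fitness values are hypothetical numbers invented for illustration and are not taken from the manuscript.

```python
import math
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Return (t statistic, Welch-Satterthwaite degrees of freedom) for two samples."""
    # Squared standard error of each sample mean (sample variance / n).
    se2_a = variance(sample_a) / len(sample_a)
    se2_b = variance(sample_b) / len(sample_b)
    t = (mean(sample_a) - mean(sample_b)) / math.sqrt(se2_a + se2_b)
    df = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (len(sample_a) - 1) + se2_b ** 2 / (len(sample_b) - 1)
    )
    return t, df


# Hypothetical relative-fitness measurements (illustrative only).
ancestor = [-0.27, -0.25, -0.26, -0.28]
evolved = [-0.10, -0.14, -0.08, -0.12]

t_stat, dof = welch_t(evolved, ancestor)
print(f"t = {t_stat:.2f}, df = {dof:.2f}")
```

    Converting the statistic into a P-value requires the t distribution; in practice a library routine such as `scipy.stats.ttest_ind(a, b, equal_var=False)` performs the whole test in one call.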

    This should also be done for the GO analysis shown in figure 3A.

    We thank reviewer #3 for pointing out this omission. We now include the following section:

    “Gene ontology (GO) enrichment analysis:

    The lists of genes with putatively selected mutations (Figure 3A) or homozygous mutations in diploids (Figure 4) were input as ‘multiple proteins’ in the STRING database, which reports on the network of interactions between the input genes. The GO term enrichment analyses provided by STRING are reported in Supplementary Table 3 and Supplementary Table 6 respectively. Briefly, the strength of the enrichment is calculated as Log10(O/E), where O is the number of ‘observed’ genes in the provided list (of length N) that belong to the GO term, and E is the number of genes ‘expected’ to match the GO term in a list of the same length N made of randomly picked genes. P-values are computed using a hypergeometric test and corrected for multiple testing using the Benjamini-Hochberg procedure. The resulting P-values are represented as ‘False discovery rate’ in the supplementary tables and describe the significance of the GO term enrichment (Franceschini et al., 2013).”
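
    The three quantities described in this section (enrichment strength, hypergeometric P-value and Benjamini-Hochberg correction) can be sketched as follows. This is an illustrative re-implementation under our own assumptions, not the code used by STRING: the symbols K (genes annotated with the GO term in the genome) and M (total genes in the genome) are introduced here, and the example numbers are hypothetical.

```python
import math


def enrichment_strength(observed: float, expected: float) -> float:
    """Log10(O/E) as described in the text."""
    return math.log10(observed / expected)


def hypergeom_pvalue(k: int, N: int, K: int, M: int) -> float:
    """P(X >= k) when N genes are drawn from M, of which K carry the GO term."""
    total = math.comb(M, N)
    upper = min(K, N)
    return sum(math.comb(K, i) * math.comb(M - K, N - i) for i in range(k, upper + 1)) / total


def benjamini_hochberg(pvalues):
    """Return BH-adjusted P-values (false discovery rates) in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical example: 8 of 50 listed genes carry a term that annotates 120 of 6000 genes.
N, K, M, O = 50, 120, 6000, 8
E = N * K / M  # expected count for a random list of length N
print(round(enrichment_strength(O, E), 2), hypergeom_pvalue(O, N, K, M))
```

    In a real analysis one P-value is computed per GO term, and the full list of P-values is then passed to `benjamini_hochberg` to obtain the false discovery rates.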

    **Minor comments:**

    • Specific experimental issues that are easily addressable. Not a new experiment, but extra details are required. The authors carried out both clone and whole population sequencing. For their convergent evolution analysis, what is the criteria for a mutation to be included- ie, does it need to be fixed, have attained a certain frequency? This is important- if the criteria were low (say 5%), it would be important to know whether gene A had fixed in 4 populations, while gene B had attained a frequency of 10% in 5 populations. As it stands both would be described as examples of convergent evolution. This can be handled by providing these details in the methods.

    For the population sequencing we disregarded variants found at less than 25% and 35% of the reads in haploid and diploid populations respectively as we observed they were largely the product of alignment errors. All the variants found at frequencies higher than the thresholds indicated were used for the parallel evolution analysis. The frequency at which each individual variant was detected in each population is reported in Supplementary table 1, while the average frequency at which a gene has been found mutated across different populations is reported in Supplementary table 2. The reason why we didn’t solely focus on fixed mutations for our convergent evolution analysis was that from previous work we knew of the existence of clonal interference which kept the frequency of verified adaptive mutations that coexisted in the same population (e.g. ixr1 and sld5) well below 90% (Fumasoni & Murray, 2020).

    For clarity we now add the following sentence in the material and methods:

    “Variants found in less than 25% and 35% of the reads in haploid and diploid populations respectively were discarded, since many of these corresponded to misalignment of repeated regions. For clone sequencing, only variants found in more than 75% of the reads in haploids and 35% of the reads in diploids (to account for heterozygosity) were considered mutations. The frequencies of the reads associated with all the variants detected are reported in Supplementary table 1.”
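
    The read-frequency thresholds quoted above amount to a simple filter, which can be sketched as below. The function and dictionary names are hypothetical and chosen for illustration; only the threshold values come from the text.

```python
# Minimum fraction of reads supporting a variant, taken from the thresholds quoted above.
POPULATION_MIN_FRACTION = {"haploid": 0.25, "diploid": 0.35}
CLONE_MIN_FRACTION = {"haploid": 0.75, "diploid": 0.35}


def keep_variant(read_fraction: float, ploidy: str, sample_type: str) -> bool:
    """Decide whether a variant passes the frequency filter.

    Population variants are discarded when found in less than the threshold;
    clone variants are kept only when found in more than the threshold.
    """
    if sample_type == "population":
        return read_fraction >= POPULATION_MIN_FRACTION[ploidy]
    if sample_type == "clone":
        return read_fraction > CLONE_MIN_FRACTION[ploidy]
    raise ValueError(f"unknown sample type: {sample_type}")


# Hypothetical variant calls: (read fraction, ploidy, sample type).
calls = [(0.12, "haploid", "population"), (0.48, "diploid", "clone"), (0.60, "haploid", "clone")]
print([keep_variant(*call) for call in calls])
```

    The lower clone threshold for diploids reflects the fact that a heterozygous mutation is expected in roughly half of the reads.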

    • Are prior studies referenced appropriately? I note that the authors use the term declining adaptability where as other papers use the term diminishing returns epistasis- I am sure the authors have good reasons for their choice of nomenclature but I think it would be helpful for their readers to connect this work to other work by mentioning that declining adaptability is also referred to as diminishing returns.

    We use both terms (for instance in line 446 and line 448) with different meanings. By ‘declining adaptability’ we refer to the phenomenon where more fit strains display lower adaptation rates than less fit ones. By ‘diminishing returns epistasis’ we refer to a possible explanation of this phenomenon, where adaptive mutations have different fitness effects due to their ‘global’ epistatic interactions with other alleles. It should be noted that ‘diminishing returns epistasis’ is not the only proposed explanation of the phenomenon of declining adaptability (Couce & Tenaillon, 2015). In our case, we do find evidence of declining adaptability but very limited evidence for diminishing returns epistasis (only 1 mutation in 5 has a different fitness effect in different architectures).

    A reference the authors have missed: L419, as well as citing the Desai Lab bioRxiv paper, they should cite another theory paper that obtained similar conclusions. Lyons, D.M., et al.

    We thank the reviewer for the suggested reference, which is now cited at line 450.

    • Are the text and figures clear and accurate? This paper is beautifully written and easy to follow, a lot of thought has gone into the figures which are aesthetically pleasing and easy to navigate.
    • Do you have suggestions that would help the authors improve the presentation of their data and conclusions? No.


    L32 "do" should be "to" L95 analyzed L219 are the authors referring to ref 15 here? I think so, but please specify

    We thank the reviewer for carefully finding the typos, which are now all corrected.

    Reviewer #3 (Significance (Required)):

    • Describe the nature and significance of the advance (e.g. conceptual, technical, clinical) for the field. This paper is an important conceptual result and an immediate advance for basic research. The authors have done an outstanding job of showing the potential for the clinical translation of this research, especially regarding cancer biology.
    • Place the work in the context of the existing literature (provide references, where appropriate). This study follows up on and builds upon an earlier paper by these same authors published in E-life in 2020. Conceptually this work is most closely related to work in Michael Desai's, Sergey Kryazhimskiy's, Tim Coopers and Chris Marx's labs work looking at diminishing returns epistasis in yeast, and studies contrasting evolution of haploids and diploids led by Greg Lang's and Sarah Otto's labs.
    • State what audience might be interested in and influenced by the reported findings. This work will be of great interest to the Experimental evolution and molecular evolution communities and also of interest to those who study cancer genomics and DNA replication and repair.
    • Define your field of expertise with a few keywords to help the authors contextualize your point of view. Indicate if there are any parts of the paper that you do not have sufficient expertise to evaluate. Microbial experimental evolution.


