Regulation of replication timing in Saccharomyces cerevisiae
This article has been reviewed by the following groups
Listed in
- Evaluated articles (Review Commons)
Abstract
In order to maintain genomic integrity, DNA replication must be highly coordinated. Disruptions in this process can cause replication stress which is aberrant in many pathologies including cancer. Despite this, little is known about the mechanisms governing the temporal regulation of DNA replication initiation, thought to be related to the limited copy number of firing factors. Here, we present a high (1-kilobase) resolution stochastic model of Saccharomyces cerevisiae whole-genome replication in which origins compete to associate with limited firing factors. After developing an algorithm to fit this model to replication timing data, we validated the model by reproducing experimental inter-origin distances, origin efficiencies, and replication fork directionality. This suggests the model accurately simulates the aspects of DNA replication most important for determining its dynamics. We also use the model to predict measures of DNA replication dynamics which are yet to be determined experimentally and investigate the potential impacts of variations in firing factor concentrations on DNA replication.
Article activity feed
-
Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.
Learn more at Review Commons
Reply to the reviewers
We are very grateful to the reviewers for their time and care in reviewing our manuscript. We have tried to incorporate all of their feedback to the best of our ability, and we feel that this has greatly improved the manuscript.
Reviewer #1
This study provides a strong support for the relationship between replication starting point competition and initial factor concentration. However, some predictive conclusions, such as "the origin of high efficiency may not be activated earlier", are still preliminary. Can the author further clarify the scope of these predictions and any potential mechanism in the discussion part to improve the rigor of this study?
Response: In the discussion, we now emphasize the complexity of predicting origin firing time distributions, which are influenced by multiple interrelated factors beyond efficiency alone.
The resolution and accuracy of the model prediction are obvious to all, but the specific generalization ability is still unknown, which makes the further promotion slightly insufficient. Does the author consider conducting additional experiments? To detect the replication time and efficiency in yeast cells with changed levels of key initiation factors (such as Cdc45 or Dpb11). The empirical data can be compared with the model prediction by editing CRISPR gene or manipulating the initial factor abundance through overexpression vector.
Response: We fully agree that this would be a very interesting direction, but as this is a theoretical study focused on mathematical modelling, conducting further wet lab experiments would be beyond the scope of this work.
The model currently uses single values for the initiation factor number and recycling rate, though these parameters may vary across cell cycles or under different growth conditions. It is suggested that sensitivity analysis should be added to supplementary materials to explore how the changes of these parameters affect the model output, such as replication time distribution and origin efficiency.
Response: A sensitivity analysis of how the model fit and validation are affected by different recycling rates and initial firing factor counts will be conducted.
While the authors use mean absolute error (MAE) to assess model fit, it is suggested to add other statistical methods, such as root mean square error or correlation analysis, to further evaluate the model's accuracy and robustness. In addition, this model lacks comparison with other studies on fitting yeast replication time, and it is difficult to evaluate the effect of this model compared with other models from the specific performance.
Response: We have now included the root mean squared error (RMSE) alongside the mean absolute error (MAE) and R-squared value to compare the simulated replication timing profiles with the experimental data. We agree that we could have been more detailed in comparing our model to other approaches. We have now added a lengthened discussion of this. In some cases, a direct comparison of performance is difficult due to fundamental differences between the approaches, but we have highlighted why this is the case.
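As an illustration of the three fit metrics named in this response, the following is a minimal sketch of how MAE, RMSE, and R-squared could be computed for a pair of replication timing profiles. It is not the authors' code: the function name `fit_metrics` and the timing values are hypothetical, and R-squared is taken here to be the coefficient of determination, which is an assumption (a squared Pearson correlation would give a different value).

```python
import math

def fit_metrics(experimental, simulated):
    """Return (MAE, RMSE, R^2) for two equal-length timing profiles."""
    if len(experimental) != len(simulated) or not experimental:
        raise ValueError("profiles must be non-empty and the same length")
    n = len(experimental)
    residuals = [s - e for e, s in zip(experimental, simulated)]
    # Mean absolute error: average size of the deviation.
    mae = sum(abs(r) for r in residuals) / n
    # Root mean squared error: penalises large deviations more strongly.
    ss_res = sum(r * r for r in residuals)
    rmse = math.sqrt(ss_res / n)
    # Coefficient of determination (assumed definition of R-squared).
    mean_exp = sum(experimental) / n
    ss_tot = sum((e - mean_exp) ** 2 for e in experimental)
    r2 = 1.0 - ss_res / ss_tot if ss_tot > 0 else float("nan")
    return mae, rmse, r2

# Hypothetical timing values (minutes), one per genomic bin; not real data.
experimental = [20.0, 25.0, 30.0, 35.0]
simulated = [22.0, 24.0, 31.0, 33.0]
mae, rmse, r2 = fit_metrics(experimental, simulated)
print(f"MAE={mae:.3f} RMSE={rmse:.3f} R2={r2:.3f}")
```

Because RMSE squares each residual before averaging, it is never smaller than MAE, and the gap between the two gives a rough sense of whether the misfit is concentrated in a few bins or spread evenly along the profile.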
Although the code is open, it is suggested to provide specific instructions or examples of the running code in supplementary materials, so as to facilitate reproduction and application by other researchers.
Response: The GitHub repository will be updated to enable the running of the entire pipeline. This update will include code for processing replication timing data from Müller et al. (2014) and extracting origin positions from the OriDB. Code will also be provided for writing Beacon Calculus scripts with different parameters and origin firing rates. Instructions on the recommended sequence in which scripts should be executed will also be provided. To enable users to run the model locally on their own computers, a smaller version focused on chromosome 2 will be included in the supplementary information and GitHub repository, along with example input data and expected outputs.
In Figure 2(a), compared with other chromosomes, the fitting effect of chromosome 1 seems to be not good. Has the author ever thought about the reason? In addition, what is the guiding significance of this model in practical applications, such as online services, forecasting tools, or experiments? Can the author give relevant application examples in this regard?
Response: Potential explanations for the poorer fit of the replication timing profile for chromosome 1 are now discussed. The y-axis range has also now been set as the same for all subplots in Figure 2a to make the replication timing profiles for each chromosome more easily comparable. In the discussion, we highlight how the intuitive and flexible nature of the model places it as a valuable tool which could be adapted to predict the effect of different perturbations on DNA replication dynamics.
Reviewer #2
In figure 5, the authors demonstrate that replication dynamics are robust to an increase in the number of available firing factors. However, experimental data from strains in which these limiting factors are overexpressed indicate that replication dynamics are substantially altered (e.g. PMID 22081107 and 23562327) since dNTPs become limiting. So the conclusions of the analysis in figure 5 are at best an oversimplification and at worst rather misleading. If adding dNTPs as a factor that becomes limiting only at higher firing factor concentrations is not technically feasible, the authors should be more circumspect in their description and discussion of the results in figure 5.
Response: We now discuss the interpretation of the effect of increasing the number of firing factors, given that factors such as dNTP availability are not included in the model.
The analysis of replication dynamics appears to exclude origins within the rDNA, which in the average strain account for ~20-25% of all replication origins in S. cerevisiae depending on the origin list chosen. Ignoring this large number of origins likely has a substantial impact on the model: if rDNA origins are intentionally ignored due to the difficulty of modeling repetitive regions or of having multiple identical origins in the competition model, this should be explicitly addressed in the text.
Response: We now emphasize that our model restricts initiation to specific sites and note that some low-efficiency origins, such as those in rDNA, have not been included.
Reviewer #3
Can the authors provide some insight into the model's dependency on the Müller, 2014 replication data set? They initialize and converge to this dataset so this paper's findings are highly contingent on treating this data set as ground truth.
Response: In the discussion, we now highlight that, despite the model's reliance on the Müller, 2014 replication data set for fitting, its ability to reproduce other features of DNA replication demonstrates its ability to reflect DNA replication dynamics more broadly.
The authors describe their model as one that simplifies the origin firing mechanisms compared to more complex models. Is there a direct comparison available that can quantify this advantage? Likewise, how does their model compare to a naive discriminative model, such as one that performs peak finding on the replication timing data. For example, the replication fork directionality can be estimated, naively, using a peak finding algorithm. This type of analysis will provide a stronger argument for the usage of their model.
Response: Quantitative comparisons between our model and other published models are challenging due to differences in underlying assumptions and metrics used to assess goodness of fit. However, we have now added a discussion addressing these challenges and highlighting how our model's design contrasts with that of other models.
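For readers unfamiliar with the naive baseline the reviewer has in mind, the sketch below shows one way it might look: candidate origins are taken as local minima of the replication timing profile, and the dominant fork direction in each interval is read off the sign of the local slope. This is an illustrative assumption about what such a baseline could be, not an analysis from the manuscript. The function names and timing values are hypothetical, and the output is only a sign per interval, whereas measured replication fork directionality is a continuous quantity.

```python
def naive_fork_direction(timing):
    """Sign of the local slope of a replication timing profile.

    +1: timing increases with position (rightward-moving forks dominate)
    -1: timing decreases with position (leftward-moving forks dominate)
     0: flat
    Returns one value per interval between adjacent bins.
    """
    signs = []
    for left, right in zip(timing, timing[1:]):
        diff = right - left
        signs.append((diff > 0) - (diff < 0))
    return signs

def naive_origins(timing):
    """Indices of strict local minima in replication time (candidate origins)."""
    return [i for i in range(1, len(timing) - 1)
            if timing[i] < timing[i - 1] and timing[i] < timing[i + 1]]

# Hypothetical profile (minutes), one value per genomic bin; not real data.
timing = [30.0, 25.0, 20.0, 24.0, 28.0, 26.0, 22.0, 27.0]
directions = naive_fork_direction(timing)
origins = naive_origins(timing)
print(directions, origins)
```

A baseline of this kind reads everything directly from the population-averaged curve, so it cannot say anything about cell-to-cell variability or about conditions that were not measured.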
Currently the code is available as supplemental data. Ideally, the code should be available and provided to run the entire pipeline beginning with the initialization of the origin firing program from the Müller, 2014 data set.
Response: The GitHub repository will be updated to enable the running of the entire pipeline. This update will include code for processing replication timing data from Müller et al. (2014) and extracting origin positions from the OriDB.
The authors describe origin firing factors and their recycling time as the basis on which this model is constructed, while also describing the recycling time as a general timing delay that depends on several factors, such as diffusion and replisome complex formation. Can the authors discuss the limitations that this simplification imposes on their model?
Response: Limitations of our model's assumptions of constant recycling rates of firing factors are now discussed, as well as our assumption that the firing rates of origins and the maximum number of available firing factors remain constant between simulations.
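To make the assumptions discussed here concrete, the following toy Gillespie-style simulation sketches origins competing for a fixed pool of firing factors that are recycled at a constant rate. It is deliberately much simpler than the Beacon Calculus model in the manuscript and should not be read as the authors' implementation. There are no replication forks and no passive replication, the rule that an origin's firing propensity is its intrinsic rate multiplied by the number of free factors is an assumption, and all names and parameter values are hypothetical.

```python
import random

def simulate_firing(rates, n_factors, recycle_rate, t_end, seed=0):
    """Toy stochastic simulation; returns {origin index: firing time}."""
    rng = random.Random(seed)
    unfired = dict(enumerate(rates))  # origin index -> intrinsic firing rate
    free, recycling = n_factors, 0    # free factors, factors awaiting recycling
    t, fired = 0.0, {}
    while True:
        rate_sum = sum(unfired.values())
        fire_total = free * rate_sum            # assumed firing propensity
        recycle_total = recycle_rate * recycling
        total = fire_total + recycle_total
        if total <= 0:
            break                               # nothing left that can happen
        t += rng.expovariate(total)             # waiting time to next event
        if t > t_end:
            break
        if rng.random() * total < fire_total:
            # Pick the firing origin with probability proportional to its rate.
            pick = rng.random() * rate_sum
            for idx, k in unfired.items():
                pick -= k
                if pick <= 0:
                    break
            del unfired[idx]
            fired[idx] = t
            free -= 1                           # firing consumes one factor
            recycling += 1
        else:
            recycling -= 1                      # one factor becomes free again
            free += 1
    return fired

# Crude sweep over the factor count, in arbitrary time units.
for n_factors in (1, 5, 20):
    times = simulate_firing([0.01] * 50, n_factors, 0.05, t_end=1e6).values()
    print(n_factors, sum(times) / len(times))
```

Holding `recycle_rate` and `n_factors` fixed for the whole run mirrors the constant-parameter assumptions named in the response. Repeating the sweep over a grid of both parameters is one simple way to see how sensitive firing times in such a toy system are to each.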
-
Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.
Learn more at Review Commons
Referee #3
Evidence, reproducibility and clarity
Summary:
In this paper, the authors create a model of origin replication in yeast using Beacon calculus and a small set of parameters. The model is described as the relationship between origin firing rate and the abundance and recycling of origin firing factors. Using the (Müller, 2014) replication timing data to initialize and fit their model, the authors show that their model recapitulates known replication-related work such as inter-origin distances, replication fork directionality, and origin efficiency. Next, they utilize their model to make predictions that characterize the broader replication program, such as in the quantification of active replication forks, replicons, and replication timing.
Major comments:
Can the authors provide some insight into the model's dependency on the Müller, 2014 replication data set? They initialize and converge to this dataset so this paper's findings are highly contingent on treating this data set as ground truth.
The authors describe their model as one that simplifies the origin firing mechanisms compared to more complex models. Is there a direct comparison available that can quantify this advantage?
Likewise, how does their model compare to a naive discriminative model, such as one that performs peak finding on the replication timing data. For example, the replication fork directionality can be estimated, naively, using a peak finding algorithm. This type of analysis will provide a stronger argument for the usage of their model.
Currently the code is available as supplemental data. Ideally, the code should be available and provided to run the entire pipeline beginning with the initialization of the origin firing program from the Müller, 2014 data set.
The authors describe origin firing factors and their recycling time as the basis on which this model is constructed, while also describing the recycling time as a general timing delay that depends on several factors, such as diffusion and replisome complex formation. Can the authors discuss the limitations that this simplification imposes on their model?
Minor comments:
The author describes the prediction of 200 active replication forks 22 minutes into S phase. Please discuss why this peak number of active replication forks may have been reached. Is this related to the model configured for the number of firing factors F = 200?
The recycling parameter appears to be very important for this model. A sensitivity analysis of the value of 0.05 would be helpful to understand why this value was chosen.
It would be helpful to understand the convergence of the model better. Can the authors provide insight or a plot to better understand why the convergence parameter alpha was chosen as 1.2?
The authors comment that simulated origin efficiencies were estimated close to zero (6.2% ± 22%). Can the authors comment on the large variability in this estimation (the ± 22%)?
Significance
General Assessment
The strength of the model is in summarizing the origin efficiency firing mechanism into a small set of parameters. This also relates to its limitations. The model asserts that the origin firing depends solely on the abundance and recycling of origin firing factors. This limits the scope of the interpretation of the mechanisms of origin firing compared to more complex models.
Additionally, the model is fit to, and thus, highly dependent on the quality of the Müller, 2014 dataset.
Improvements:
This work can be improved by comparing and contrasting their results to existing models where they argue the advantages of employing a simpler model for origin firing compared to more complex ones they cite (Arbona, 2018; de Moura, 2010; Retkute, 2014; Brümmer, 2010).
While their modeling and dependency on the Müller, 2014 replication timing data may be sufficient, some of the findings can be naively characterized from this data set, such as replication fork direction and origin firing times. Thus, the authors can argue the strengths of their model by contrasting it with simpler, more naive quantifications.
Currently the paper is very descriptive. A nice addition would be to model the effects of Rpd3 deletion, which is thought to have either a direct effect on late origins (advancing their time of replication) or an indirect effect via the rDNA locus, which may (in the absence of Rpd3) act as a sink for limiting replication factors (Vogelauer et al., Mol Cell, 2002; Yoshida et al., Mol Cell, 2014; He et al., PNAS, 2022). Specifically, how does titrating the number of active rDNA origins out of the ~150 available rDNA origins impact global origin usage under this model?
Scope:
Audience: Specialized towards groups modeling and studying replication.
Reviewer's field of expertise: Computer science, computational biology, bioinformatics, and general computational modeling
-
Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.
Learn more at Review Commons
Referee #2
Evidence, reproducibility and clarity
In this manuscript, Berners-Lee et al extend the beacon calculus approach previously developed by the Boemo lab to model the dynamics of Saccharomyces cerevisiae genome duplication at high resolution, based on competition for limiting origin firing factors. The simulations converge to produce a timing profile that closely matches experimentally determined replication dynamics through the genome. In an extension, the authors model how an increase in firing factor availability (assuming abundant dNTPs) would affect replication dynamics and conclude that overall timing would be robust.
Major comments
In figure 5, the authors demonstrate that replication dynamics are robust to an increase in the number of available firing factors. However, experimental data from strains in which these limiting factors are overexpressed indicate that replication dynamics are substantially altered (e.g. PMID 22081107 and 23562327) since dNTPs become limiting. So the conclusions of the analysis in figure 5 are at best an oversimplification and at worst rather misleading. If adding dNTPs as a factor that becomes limiting only at higher firing factor concentrations is not technically feasible, the authors should be more circumspect in their description and discussion of the results in figure 5.
The analysis of replication dynamics appears to exclude origins within the rDNA, which in the average strain account for ~20-25% of all replication origins in S. cerevisiae depending on the origin list chosen. Ignoring this large number of origins likely has a substantial impact on the model: if rDNA origins are intentionally ignored due to the difficulty of modeling repetitive regions or of having multiple identical origins in the competition model, this should be explicitly addressed in the text.
Minor comment
Sekedat et al. (2010, PMID: 20212525) demonstrated convincingly that replication-fork movement is uniform throughout the genome but are not cited in favor of more recent work.
Significance
This manuscript will be of interest to researchers working on DNA replication dynamics, since the methodology and conclusions could be extended to other genomes for which high-quality replication timing data are available. The technical advance of including limiting firing factor availability is interesting, although the overall utility of these models is perhaps somewhat limited by the need for experimental data on which the model can converge. Extending the model to include known additional factors affecting replication-fork movement and replication timing as outlined above would extend the significance, especially since variations in replication-fork speed are associated with genome instability (e.g. PMID 29950726), differentiation (e.g. PMID 35256805) and other biologically important phenomena.
Expertise: molecular biology, high-throughput analysis of DNA replication. I do not have sufficient expertise to evaluate the mathematical model itself.
-
Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.
Learn more at Review Commons
Referee #1
Evidence, reproducibility and clarity
Summary
This study develops a high-resolution stochastic model to explore DNA replication timing regulation in Saccharomyces cerevisiae, specifically focusing on competition between replication origins for limited initiation factors. The model, based on "Beacon Calculus," utilizes an iterative optimization process to fit experimental data, successfully reproducing timing, efficiency, and directionality features of genome replication origins. Additionally, the authors use the model to make predictions on replication dynamics under varying initiation factor concentrations, providing new insights into DNA replication processes that have not yet been observed empirically or experimentally.
Major Comments:
- This study provides a strong support for the relationship between replication starting point competition and initial factor concentration. However, some predictive conclusions, such as "the origin of high efficiency may not be activated earlier", are still preliminary. Can the author further clarify the scope of these predictions and any potential mechanism in the discussion part to improve the rigor of this study?
- The resolution and accuracy of the model prediction are obvious to all, but the specific generalization ability is still unknown, which makes the further promotion slightly insufficient. Does the author consider conducting additional experiments? To detect the replication time and efficiency in yeast cells with changed levels of key initiation factors (such as Cdc45 or Dpb11). The empirical data can be compared with the model prediction by editing CRISPR gene or manipulating the initial factor abundance through overexpression vector.
- The model currently uses single values for the initiation factor number and recycling rate, though these parameters may vary across cell cycles or under different growth conditions. It is suggested that sensitivity analysis should be added to supplementary materials to explore how the changes of these parameters affect the model output, such as replication time distribution and origin efficiency.
- While the authors use mean absolute error (MAE) to assess model fit, it is suggested to add other statistical methods, such as root mean square error or correlation analysis, to further evaluate the model's accuracy and robustness. In addition, this model lacks comparison with other studies on fitting yeast replication time, and it is difficult to evaluate the effect of this model compared with other models from the specific performance.
- Although the code is open, it is suggested to provide specific instructions or examples of the running code in supplementary materials, so as to facilitate reproduction and application by other researchers.
- In Figure 2(a), compared with other chromosomes, the fitting effect of chromosome 1 seems to be not good. Has the author ever thought about the reason? In addition, what is the guiding significance of this model in practical applications, such as online services, forecasting tools, or experiments? Can the author give relevant application examples in this regard?
Minor Comments:
- Suggestions for Improving Figures: Figures 2 and 3: It is suggested that the differences between experimental data and model fitting data should be clearly marked by using more distinctive colors or symbols with different shapes in these figures, so as to help readers quickly distinguish between simulation results and experimental observation results. Density Plot in Figure 4: The current color gradient is dense, making it difficult to differentiate activation distributions for different origins. Consider using a broader color gradient or adding a slight separation between each origin's curve to improve readability.
- Model Parameter Table: Adding a table in the Methods section or supplementary materials that summarizes the main model parameters (e.g., number of initiation factors, recycling rate, replication speed) and the basis for each parameter's setting would be helpful. This will allow readers to quickly understand the model setup and provide a reference for future researchers who may wish to use or adjust this model.
- Citation and Description of Experimental Data: Clarify the origin and characteristics of the experimental data used, such as the specific details of the replication timing dataset applied for model fitting, and indicate whether the data represents single-cell or population-averaged measurements. This information will help readers better understand the comparison between the model and actual data.
- Background and References: In the Introduction, consider adding a brief explanation of "Beacon Calculus" to aid non-specialist readers in understanding the novelty and applicability of this method. Adding foundational references for Beacon Calculus would further help readers appreciate the advantages of this approach. Additionally, in the discussion of the model's suitability for other biological systems, citing some reviews on high-efficiency replication origin analyses would help demonstrate the model's broader applicability.
Significance
- Significance of the Research:
This study advances our understanding of DNA replication timing regulation in S. cerevisiae and presents a mathematical modeling approach with theoretical importance. By reconstructing a DNA replication timing framework for yeast, the model also provides a foundation that could be adapted for other systems, potentially advancing modeling techniques in genome replication research.
- Relation to Existing Literature:
This study builds upon prior research on S. cerevisiae DNA replication initiation and proposes a simplified, reproducible model. Compared to more complex mathematical models or large-scale data analyses, this approach is more interpretable and easier to reproduce. The study's predictions on initiation factor concentration effects provide another perspective for future experimental work.
- Target Audience:
This work will influence researchers studying DNA replication regulation, yeast genomics, and bioinformatics modeling. Additionally, scholars in microbiology and genetics may also benefit from the innovative modeling methods introduced.
- Reviewer Expertise:
My expertise includes computational biology and bioinformatics, with a professional knowledge in DNA replication origins and bioinformatics modeling.
-