Future COVID19 surges prediction based on SARS-CoV-2 mutations surveillance

Fares Z Najar
Evan Linde
Chelsea L Murphy
Veniamin A Borin
Huan Wang
Shozeb Haider
Pratul K Agarwal

Curated by eLife

eLife assessment

Najar et al. present a method for identifying the emergence of new variants before the accompanying surge in cases by examining the trend in accumulated non-synonymous mutations relative to the original Wuhan 2020 SARS-CoV-2 strain. This is an interesting idea, but additional evidence is required to establish it as a robust tool for predictions.

Abstract

COVID19 has aptly revealed that airborne viruses such as SARS-CoV-2, which combine the ability to mutate rapidly with high rates of transmission and fatality, can cause a deadly worldwide pandemic in a matter of weeks (Plato et al., 2021). Apart from vaccines and post-infection treatment options, strategies for preparedness will be vital in responding to the current and future pandemics. Therefore, there is wide interest in approaches that allow increases in infections (‘surges’) to be predicted before they occur. We describe here real-time genomic surveillance, based in particular on mutation analysis of viral proteins, as a methodology for a priori determination of surges in the number of infection cases. The full results are available for SARS-CoV-2 at http://pandemics.okstate.edu/covid19/ and are updated daily as new virus sequences become available. This approach is generic and will also be applicable to other pathogens.

Version published to 10.7554/elife.82980 on eLife
Jan 19, 2023
eLife
Nov 28, 2022

Author Response

Reviewer #1 (Public Review):

This paper details the creation and data behind the website http://pandemics.okstate.edu/covid19/. The authors attempt to explore if there is a cause and effect between the detection of unusually increased mutation activity in the genomic surveillance databases and subsequent near-term surges in SARS-CoV-2 case numbers.

Overall, the premise is interesting because, other than following case numbers reported to health authorities and observing what is happening in another country, there is no reliable way to predict when a surge is going to occur. Unfortunately, the data demonstrate that there was no reliable metric that could be used to predict surge events. Interestingly, the website has currently issued a "surge alert" for the month of September. It will be interesting to observe whether their model indeed has predictive power or whether the current analysis is merely coincidental with the surges but not necessarily predictive of them.

In this work, we investigated a number of metrics in search of a reliable signal for surge prediction. The commonly used ratio ka/ks (non-synonymous to synonymous mutations), and the derivative of ka/ks with respect to time, did not provide a reliable metric. However, for the same data, ka has provided a fairly robust surveillance signal so far. We believe ka/ks studies provide insights into genome changes, but not over short time periods such as days (at least not in the case of SARS-CoV-2). As the motivation of our work is to provide the community with a genomic surveillance approach in real time, we believe that the current data show that ka is, at present, a useful and fairly reliable metric.
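To make the distinction between the two kinds of mutation concrete, the following is a minimal sketch of how non-synonymous and synonymous codon changes could be counted against a reference coding sequence and averaged per collection day. It is an illustration under stated assumptions, not the pipeline behind the website: the function names `count_codon_changes` and `mean_nonsyn_by_day` are hypothetical, the sequences are assumed to be pre-aligned, in-frame coding DNA of equal length, and the raw counts are only a simplified proxy for ka and ks, which are normally normalised by the number of available sites.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def count_codon_changes(reference, sample):
    """Return (non_synonymous, synonymous) codon differences versus the reference.

    Assumption: both inputs are aligned, in-frame coding DNA of equal length.
    """
    if len(reference) != len(sample) or len(reference) % 3:
        raise ValueError("sequences must be aligned and a multiple of 3 long")
    nonsyn = syn = 0
    for i in range(0, len(reference), 3):
        ref_codon = reference[i:i + 3].upper()
        alt_codon = sample[i:i + 3].upper()
        if ref_codon == alt_codon:
            continue
        if ref_codon not in CODON_TABLE or alt_codon not in CODON_TABLE:
            continue  # skip gaps and ambiguous bases such as N
        if CODON_TABLE[ref_codon] == CODON_TABLE[alt_codon]:
            syn += 1      # same amino acid: synonymous change
        else:
            nonsyn += 1   # different amino acid: non-synonymous change
    return nonsyn, syn


def mean_nonsyn_by_day(reference, dated_samples):
    """Average the non-synonymous count over all samples collected on each day.

    dated_samples is an iterable of (day_label, sequence) pairs.
    """
    totals = {}
    for day, seq in dated_samples:
        nonsyn, _ = count_codon_changes(reference, seq)
        running_sum, n = totals.get(day, (0, 0))
        totals[day] = (running_sum + nonsyn, n + 1)
    return {day: s / n for day, (s, n) in sorted(totals.items())}
```

For example, with the toy reference `"ATGGCTAAA"`, the sample `"ATGGCCAAA"` differs by one synonymous change and `"ATGGTTAAA"` by one non-synonymous change. A daily series produced by `mean_nonsyn_by_day` is the kind of trend that could then be inspected for unusual increases; any alert threshold would be a further assumption that this sketch does not make.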

As the reviewer mentioned, while this manuscript was being reviewed, we issued a warning on September 7th 2022. Several different types of data (including the number of new infections, the number of hospitalizations, and COVID19-related deaths) have indicated that our warning was accurate: the reported number of cases surged in September and reached a peak in October. For instance, the plots shown in Figure S6 indicate that there was a surge in the number of cases across Europe at large and in several individual countries, including France, the United Kingdom, Germany and Italy. Similarly, our earlier warning in June was also followed by surges reported across many countries and collectively across the world (Figure S5). Therefore, we believe the presented methodology has been validated.

Reviewer #2 (Public Review):

In this manuscript, Najar et al. perform a comprehensive bioinformatic analysis of SARS-CoV-2 sequences available from public repositories. Through a comparison with the genome sequence of the original Wuhan 2020 strain, they identify the total accumulation of non-synonymous mutations as a predictor of the evolution of new strains. The manuscript provides data for three structural proteins, the spike (S), membrane (M), and envelope (E) proteins, as well as data for the non-structural RNA-dependent RNA polymerase (RdRp) protein, which serves as a negative control. However, the predictivity of this approach is marked only for the Omicron variant, with considerable variation in the predictive power of SARS-CoV-2 proteins for other variants. Focusing on spike, the method does not detect the Alpha or Delta variant surges, which were mostly driven by changes in the spike protein, although less sequencing data might have been available for the Delta variant. Notably, although the authors conclude that other parameters, such as the ratio of non-synonymous to synonymous mutations or the rate of accumulation of non-synonymous mutations, are not predictive, these parameters appear to have similar success in predicting the Omicron surge.

We agree with the reviewer: the case of the spike protein during the Alpha surge could have been affected by an insufficient number of sequences. In the case of the Gamma/Delta variants, we did notice changes in the spike and membrane proteins. For Omicron and its various sub-variants, the use of ka provides a reliable signal due to changes in the spike, membrane and envelope proteins.

eLife
Nov 21, 2022

Reviewer #1 (Public Review):

This paper details the creation and data behind the website http://pandemics.okstate.edu/covid19/. The authors attempt to explore if there is a cause and effect between the detection of unusually increased mutation activity in the genomic surveillance databases and subsequent near-term surges in SARS-CoV-2 case numbers.

Overall, the premise is interesting because, other than following case numbers reported to health authorities and observing what is happening in another country, there is no reliable way to predict when a surge is going to occur. Unfortunately, the data demonstrate that there was no reliable metric that could be used to predict surge events. Interestingly, the website has currently issued a "surge alert" for the month of September. It will be interesting to observe whether their model indeed has predictive power or whether the current analysis is merely coincidental with the surges but not necessarily predictive of them.

eLife
Nov 21, 2022

Reviewer #2 (Public Review):

In this manuscript, Najar et al. perform a comprehensive bioinformatic analysis of SARS-CoV-2 sequences available from public repositories. Through a comparison with the genome sequence of the original Wuhan 2020 strain, they identify the total accumulation of non-synonymous mutations as a predictor of the evolution of new strains. The manuscript provides data for three structural proteins, the spike (S), membrane (M), and envelope (E) proteins, as well as data for the non-structural RNA-dependent RNA polymerase (RdRp) protein, which serves as a negative control. However, the predictivity of this approach is marked only for the Omicron variant, with considerable variation in the predictive power of SARS-CoV-2 proteins for other variants. Focusing on spike, the method does not detect the Alpha or Delta variant surges, which were mostly driven by changes in the spike protein, although less sequencing data might have been available for the Delta variant. Notably, although the authors conclude that other parameters, such as the ratio of non-synonymous to synonymous mutations or the rate of accumulation of non-synonymous mutations, are not predictive, these parameters appear to have similar success in predicting the Omicron surge.

Version published to 10.1101/2022.09.05.506640 on bioRxiv
Sep 7, 2022

