Future COVID19 surges prediction based on SARS-CoV-2 mutations surveillance

Curation statements for this article:
  • Curated by eLife

    eLife assessment

    Najar et al. present a method for identifying the emergence of new variants before the accompanying surge in cases by examining the trend of accumulated non-synonymous mutations relative to the original Wuhan 2020 SARS-CoV-2 strain. This is an interesting idea but requires additional evidence to establish it as a robust tool for predictions.

This article has been Reviewed by the following groups

Abstract

COVID19 has aptly revealed that airborne viruses such as SARS-CoV-2, which combine the ability to mutate rapidly with high rates of transmission and fatality, can cause a deadly worldwide pandemic in a matter of weeks (Plato et al., 2021). Apart from vaccines and post-infection treatment options, strategies for preparedness will be vital in responding to the current and future pandemics. Therefore, there is wide interest in approaches that allow increases in infections (‘surges’) to be predicted before they occur. We describe here real-time genomic surveillance, based in particular on mutation analysis of viral proteins, as a methodology for a priori determination of surges in the number of infection cases. The full results are available for SARS-CoV-2 at http://pandemics.okstate.edu/covid19/ and are updated daily as new virus sequences become available. This approach is generic and will also be applicable to other pathogens.

Article activity feed

  1. Author Response

    Reviewer #1 (Public Review):

    This paper details the creation and data behind the website http://pandemics.okstate.edu/covid19/. The authors attempt to explore if there is a cause and effect between the detection of unusually increased mutation activity in the genomic surveillance databases and subsequent near-term surges in SARS-CoV-2 case numbers.

    Overall the premise is interesting as other than following case numbers reported to health authorities and observing what is happening in another country, there is no reliable way to predict when a surge is going to occur. Unfortunately, the data demonstrate that there was no reliable metric that could be used to predict surge events. Interestingly, the website has issued a "surge alert" currently for the month of September. It will be interesting to observe whether their model indeed has predictive power or whether the current analysis is merely coincidental with the surges but not necessarily predictive of them.

    In this work, we investigated a number of metrics in search of a reliable signal for surge prediction. The commonly used ratio ka/ks and the derivative of ka/ks with respect to time did not provide a reliable metric. However, for the same data, ka has so far provided a fairly robust surveillance signal. We believe ka/ks studies provide insights into genome changes, but not over short time periods such as days (at least not in the case of SARS-CoV-2). As the motivation of our work is to provide the community with a genomic surveillance approach in real time, we believe that the current data show that ka is, at present, a useful and fairly reliable metric.
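
    The distinction drawn above between ka and ka/ks can be illustrated with a minimal Python sketch. It is not the authors' pipeline: it assumes two aligned, in-frame coding sequences of equal length, and it reports raw counts of non-synonymous and synonymous codon differences rather than the site-normalized rates that formal ka and ks estimators use. The function name `count_codon_changes` and the example sequences are hypothetical.

```python
from itertools import product

# Standard genetic code in the conventional TCAG ordering ('*' marks stop codons).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def count_codon_changes(reference, sample):
    """Return (nonsynonymous, synonymous) counts of codons that differ.

    Assumption: both inputs are aligned coding sequences of equal length
    whose length is a multiple of three. These are raw counts, not
    site-normalized ka/ks rates.
    """
    if len(reference) != len(sample) or len(reference) % 3 != 0:
        raise ValueError("sequences must be aligned, equal in length, and in frame")

    nonsyn = syn = 0
    for i in range(0, len(reference), 3):
        ref_codon = reference[i:i + 3].upper()
        alt_codon = sample[i:i + 3].upper()
        if ref_codon == alt_codon:
            continue
        # Skip codons containing gaps or ambiguous bases (e.g. '-', 'N').
        if ref_codon not in CODON_TABLE or alt_codon not in CODON_TABLE:
            continue
        if CODON_TABLE[ref_codon] == CODON_TABLE[alt_codon]:
            syn += 1
        else:
            nonsyn += 1
    return nonsyn, syn


# Hypothetical toy sequences: GAT->GGT changes Asp to Gly (non-synonymous),
# GGT->GGC leaves Gly unchanged (synonymous).
reference = "ATGGATGGTCTT"
sample = "ATGGGTGGCCTT"
nonsyn, syn = count_codon_changes(reference, sample)
print(nonsyn, syn)
```

    A ratio of the two counts becomes unstable whenever the synonymous count is small or zero, which is one plausible reason a ratio-based signal is noisier over short time windows than the non-synonymous count alone; this is an interpretation, not a claim made in the response.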

    As the reviewer mentioned, while this manuscript was being reviewed, we issued a warning on September 7th 2022. Several different types of data (including the number of new infections, the number of hospitalizations, and COVID19-related deaths) have indicated that our warning was accurate, since there was a surge in the reported number of cases in September, which reached a peak in October. For instance, the plots shown in Figure S6 indicate that there was a surge in the number of cases across Europe at large and in several individual countries, including France, the United Kingdom, Germany and Italy. Similarly, our earlier warning in June was also followed by surges reported across many countries and collectively across the world (Figure S5). Therefore, we believe the presented methodology has been validated.

    Reviewer #2 (Public Review):

    In this manuscript, Najar et al. perform a comprehensive bioinformatic analysis of SARS-CoV-2 sequences available from public repositories. Through a comparison with the genome sequence of the original Wuhan 2020 strain, they identify the total accumulation of non-synonymous mutations as a predictor of the evolution of new strains. The manuscript provides data for three structural proteins, the spike (S), membrane (M), and envelope (E) proteins, as well as data for the non-structural RNA-dependent RNA polymerase (RdRp) protein, which serves as a negative control. However, the predictivity of this approach is most marked only for the Omicron variant, with considerable variation in the predictive power of SARS-CoV-2 proteins for other variants. Focusing on spike, the method does not detect the Alpha variant or Delta variant surges, which were mostly driven by changes in the spike protein, although the level of sequencing data available for the Delta variant might have been less. Notably, although the authors conclude that other parameters, such as the ratio of non-synonymous to synonymous mutations or the rate of accumulation of non-synonymous mutations, are not predictive, these appear to have similar success in predicting the Omicron surge.

    We agree with the reviewer: the case of the spike protein during the Alpha surge could have been affected by an insufficient number of sequences. In the case of the Gamma/Delta variants, we did notice changes in the spike and the membrane proteins. For Omicron and its various sub-variants, the use of ka provides a reliable signal due to changes in the spike, membrane and envelope proteins.
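
    The trend discussed in this exchange, accumulated non-synonymous mutations followed over time, can be sketched as a simple aggregation. The snippet below is an illustration under stated assumptions rather than the method used on the website: it assumes each sequence has already been reduced to a collection date and a non-synonymous mutation count relative to the reference, and it groups them by ISO week. The function name `weekly_mean_counts` and all values are hypothetical and are not data from the study.

```python
from collections import defaultdict
from datetime import date


def weekly_mean_counts(records):
    """Average non-synonymous mutation counts per ISO week.

    records: iterable of (collection_date, nonsyn_count) pairs (assumed format).
    Returns a dict mapping (iso_year, iso_week) to the mean count, in
    chronological order.
    """
    buckets = defaultdict(list)
    for collected, count in records:
        iso = collected.isocalendar()
        buckets[(iso[0], iso[1])].append(count)
    return {week: sum(vals) / len(vals) for week, vals in sorted(buckets.items())}


# Hypothetical values for illustration only (not data from the study).
records = [
    (date(2022, 1, 3), 30),
    (date(2022, 1, 5), 34),
    (date(2022, 1, 10), 40),
]
trend = weekly_mean_counts(records)
for week, mean_count in trend.items():
    print(week, mean_count)
```

    Weeks with few sequences give noisy averages, which is consistent with the concern raised above about an insufficient number of sequences; any alerting rule applied on top of such a trend would need its own threshold, which the text here does not specify.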