Tajima D test accurately forecasts Omicron / COVID-19 outbreak

Abstract

On 26 November 2021, the World Health Organization designated the SARS-CoV-2 variant B.1.1.529, Omicron, a variant of concern. However, the phylogenetic and evolutionary dynamics of this variant remain unclear. An analysis of the 131 Omicron variant sequences from November 9 to November 28, 2021 reveals that the variants have diverged into at least 6 major subgroups. 86.3% of the cases have an insertion at amino acid 214 (INS214EPE) of the spike protein. Neutrality analysis with the DH (−2.814, p < 0.001) and Zeng’s E (0.0583, p = 1.0) tests suggested that directional selection was the major driving force of Omicron variant evolution. The synonymous (Dsyn) and nonsynonymous (Dnonsyn) polymorphisms of the Omicron variant spike gene were estimated with Tajima’s D statistic to eliminate homogenous effects. Both the D ratio (Dnonsyn/Dsyn, 1.57) and ΔD (Dsyn − Dnonsyn, 0.63) indicate that purifying selection operates at present. The low nucleotide diversity (0.00008) and Tajima’s D value (−2.709, p < 0.001) also confirm that Omicron variants had already been spreading in the human population for more than the 6 weeks that have been reported. These results, along with our previous analysis of Delta and Lambda variants, also support the validity of the Tajima’s D test score, with a threshold value of −2.50, as an accurate predictor of new COVID-19 outbreaks.
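The abstract relies on two summary statistics, nucleotide diversity and Tajima's D, and on a cutoff of −2.50. The following is a minimal sketch of how those quantities can be computed from an alignment using the standard formulas of Tajima (1989). It is not the authors' pipeline: the names `tajimas_d`, `NeutralityStats`, and `below_threshold` are hypothetical, the input is assumed to be a gap-free alignment of equal-length strings, and no p-value is computed.

```python
from collections import Counter
from math import sqrt
from typing import NamedTuple, Sequence


class NeutralityStats(NamedTuple):
    n: int                       # number of sequences
    segregating_sites: int       # S
    mean_pairwise_diff: float    # average pairwise differences (k-hat)
    nucleotide_diversity: float  # k-hat divided by alignment length
    tajimas_d: float


def tajimas_d(alignment: Sequence[str]) -> NeutralityStats:
    """Tajima's D for a gap-free alignment (assumption: no ambiguous bases)."""
    n = len(alignment)
    if n < 4:
        # The variance term is zero for n = 3, so D is not usable below 4.
        raise ValueError("need at least 4 sequences")
    length = len(alignment[0])
    if length == 0 or any(len(s) != length for s in alignment):
        raise ValueError("sequences must share the same non-zero length")

    n_pairs = n * (n - 1) / 2
    seg_sites = 0
    k_hat = 0.0
    for column in zip(*alignment):
        counts = Counter(column)
        if len(counts) > 1:
            seg_sites += 1
            # Pairs that differ at this site = all pairs minus identical pairs.
            identical = sum(c * (c - 1) / 2 for c in counts.values())
            k_hat += (n_pairs - identical) / n_pairs
    if seg_sites == 0:
        raise ValueError("no segregating sites; D is undefined")

    # Constants from Tajima (1989).
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n**2 + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)

    variance = e1 * seg_sites + e2 * seg_sites * (seg_sites - 1)
    d = (k_hat - seg_sites / a1) / sqrt(variance)
    return NeutralityStats(n, seg_sites, k_hat, k_hat / length, d)


def below_threshold(d: float, threshold: float = -2.50) -> bool:
    """True when D falls below the cutoff proposed in the abstract."""
    return d < threshold


if __name__ == "__main__":
    # Toy alignment in which every variable site is a singleton.
    toy = ["AAAA", "AAAT", "AATA", "ATAA"]
    stats = tajimas_d(toy)
    print(stats, below_threshold(stats.tajimas_d))
```

In the toy alignment every polymorphism is a singleton, so D comes out negative, which is the excess of low-frequency variants that the abstract interprets as a signal of recent expansion or selection.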

Article activity feed

  1. SciScore for 10.1101/2021.12.02.21267185:

    Please note, not all rigor criteria are appropriate for all manuscripts.

    Table 1: Rigor

    NIH rigor criteria are not applicable to paper type.

    Table 2: Resources

    Software and Algorithms
    Sentences | Resources
    The radial phylogenetic tree was generated by exporting the tree file in Newick format by MAFFT. | MAFFT (suggested: MAFFT, RRID:SCR_011811)
    The FigTree software (version 1.4.2) was used to display the cladogram (Rambaut, 2021). | FigTree (suggested: FigTree, RRID:SCR_008515)

    Results from OddPub: Thank you for sharing your data.


    Results from LimitationRecognizer: We detected the following sentences addressing limitations in the study:
    We and others have previously shown the limitations of applying the inter-species divergence test, dN/dS (ω), to the analysis of selection pressure on SARS-CoV-2 (Mugal et al., 2014; Kang et al., 2021; Yeh and Contreras, 2021a; Yeh and Contreras, 2021b). Therefore, here we determined the purifying selection using a modified Tajima’s D statistic instead of the dN/dS (ω) test. Under purifying selection, the frequency distribution of non-synonymous polymorphisms is negatively skewed relative to the distribution of synonymous polymorphisms. Therefore, Tajima’s D takes more negative values for non-synonymous sites (Dnonsyn) than for synonymous sites (Dsyn) of a given gene (Hahn et al., 2002; Hughes et al., 2005; Yeh and Contreras, 2021b). One of the major advantages of Dnonsyn and Dsyn analysis is that it is independent of sample size, which allows us to compare Dnonsyn and Dsyn values among different data sets (Hughes et al., 2008). The same excess of low-frequency alleles in non-synonymous polymorphism was also shown by the Dnonsyn values of the spike (−1.771, p<0.001) and N gene (−1.943, p<0.005) (Table 1). We conclude that purifying selection led to constraints on the neutral mutations at non-synonymous sites of the spike gene of Omicron variants. Negative values of Tajima’s D could be caused by a bottleneck event rather than selection, but the bottleneck effect should affect all types of polymorphism equally (Tajima, 1989). Hahn et al. have reported that Dnonsyn is disproportionally lower...
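The comparison described in this passage reduces to simple arithmetic on two D values. The sketch below uses the definitions given in the abstract (D ratio = Dnonsyn/Dsyn, ΔD = Dsyn − Dnonsyn). The function names are hypothetical and the input values are illustrative placeholders, not figures from the preprint.

```python
from typing import Tuple


def compare_d(d_nonsyn: float, d_syn: float) -> Tuple[float, float]:
    """Return (D ratio, delta D) as defined in the abstract."""
    if d_syn == 0:
        raise ValueError("D ratio is undefined when Dsyn is zero")
    ratio = d_nonsyn / d_syn   # Dnonsyn / Dsyn
    delta = d_syn - d_nonsyn   # Dsyn - Dnonsyn
    return ratio, delta


def suggests_purifying(d_nonsyn: float, d_syn: float) -> bool:
    """Purifying selection is suggested when Dnonsyn is more negative than Dsyn."""
    return d_nonsyn < d_syn


if __name__ == "__main__":
    # Illustrative values only (assumption, not taken from Table 1).
    ratio, delta = compare_d(d_nonsyn=-1.8, d_syn=-1.2)
    print(round(ratio, 2), round(delta, 2), suggests_purifying(-1.8, -1.2))
```

When both values are negative, a ratio above 1 and a positive ΔD express the same condition: non-synonymous sites carry a stronger excess of low-frequency variants than synonymous sites. A bottleneck alone would be expected to shift both values together, which is why the passage compares them rather than reading either one in isolation.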

    Results from TrialIdentifier: No clinical trial numbers were referenced.


    Results from Barzooka: We did not find any issues relating to the usage of bar graphs.


    Results from JetFighter: Please consider improving the rainbow (“jet”) colormap(s) used on page 16. At least one figure is not accessible to readers with colorblindness and/or is not true to the data, i.e. not perceptually uniform.


    Results from rtransparent:
    • Thank you for including a conflict of interest statement. Authors are encouraged to include this statement when submitting to a journal.
    • Thank you for including a funding statement. Authors are encouraged to include this statement when submitting to a journal.
    • No protocol registration statement was detected.

    Results from scite Reference Check: We found no unreliable references.


    About SciScore

    SciScore is an automated tool that is designed to assist expert reviewers by finding and presenting formulaic information scattered throughout a paper in a standard, easy to digest format. SciScore checks for the presence and correctness of RRIDs (research resource identifiers), and for rigor criteria such as sex and investigator blinding. For details on the theoretical underpinning of rigor criteria and the tools shown here, including references cited, please follow this link.