Generalized linear models provide a measure of virulence for specific mutations in SARS-CoV-2 strains

Abstract

This study aims to highlight SARS-CoV-2 mutations that are associated with increased or decreased viral virulence. We utilize genetic data from all strains available from GISAID and countries’ regional information, such as deaths and cases per million, as well as COVID-19-related public health austerity measure response times. Initial indications of selective advantage of specific mutations can be obtained from calculating their frequencies across viral strains. By applying modelling approaches, we provide additional information that is not evident from standard statistics or mutation frequencies alone. We therefore propose a more precise way of selecting informative mutations. We highlight two interesting mutations found in genes N (P13L) and ORF3a (Q57H). The former appears to be significantly associated with decreased deaths and cases per million according to our models, while the latter shows a contrasting association, with decreased deaths but increased cases per million. Moreover, protein structure prediction tools show that the mutations induce conformational changes that significantly alter the protein structure when compared to the reference protein.
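
The frequency calculation mentioned in the abstract as a first indication of selective advantage can be illustrated with a minimal sketch. It is not the authors' pipeline: the function name `mutation_frequencies`, the `gene:mutation` labels, and the toy strain data are all hypothetical and chosen only for illustration.

```python
from collections import Counter


def mutation_frequencies(strains):
    """Return the fraction of strains that carry each mutation.

    `strains` maps a strain ID to the collection of mutations it carries.
    (Hypothetical input layout, assumed for this sketch.)
    """
    if not strains:
        return {}
    counts = Counter()
    for mutations in strains.values():
        # Count each mutation at most once per strain.
        counts.update(set(mutations))
    total = len(strains)
    return {mutation: n / total for mutation, n in counts.items()}


# Toy data invented for illustration; not taken from GISAID.
toy_strains = {
    "strain_A": {"N:P13L", "ORF3a:Q57H"},
    "strain_B": {"ORF3a:Q57H"},
    "strain_C": set(),
    "strain_D": {"N:P13L"},
}

frequencies = mutation_frequencies(toy_strains)
for mutation, freq in sorted(frequencies.items()):
    print(f"{mutation}: {freq:.2f}")
```

Frequencies of this kind only describe how common a mutation is. Relating a mutation to deaths or cases per million, as the abstract describes, requires a fitted model on top of them.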

SciScore for 10.1101/2020.08.17.253484:

Please note, not all rigor criteria are appropriate for all manuscripts.

Table 1: Rigor

NIH rigor criteria are not applicable to paper type.

Table 2: Resources

Software and Algorithms
Sentences	Resources
phangorn, lme4, dfoptim, car, reshape2, ggplot2, gridExtra, PredictABEL, dplyr, tidyr, scales, ggpubr.	ggplot2 suggested: (ggplot2, RRID:SCR_014601)
I-TASSER was selected for protein structure modelling, since it outperformed other servers according to results from the 13th Community Wide Experiment on the Critical Assessment of Techniques for Protein Structure Prediction (CASP13)[30]	I-TASSER suggested: (I-TASSER, RRID:SCR_014627)
The PyMOL software (https://pymol.org/2/) was used for the visualization of the protein molecules.	PyMOL suggested: (PyMOL, RRID:SCR_000305)
Protein-protein complexes were constructed using the ClusPro (v2.0)[33] and HDOCK[31] algorithms and binding affinities were calculated using the PRODIGY webserver[34].	ClusPro suggested: (ClusPro, RRID:SCR_018248)
The DynaMut webserver[36], was used to visualize non-covalent molecular interactions, calculated by the Arpeggio algorithm[37].	Arpeggio suggested: (Arpeggio, RRID:SCR_010876)
Finally, binding affinities and dissociation constants (Kd) were calculated using the PRODIGY webserver[34].	PRODIGY suggested: None

Results from OddPub: Thank you for sharing your code.

Results from LimitationRecognizer: An explicit section about the limitations of the techniques employed in this study was not found. We encourage authors to address study limitations.

Results from TrialIdentifier: No clinical trial numbers were referenced.

Results from Barzooka: We found bar graphs of continuous data. We recommend replacing bar graphs with more informative graphics, as many different datasets can lead to the same bar graph. The actual data may suggest different conclusions from the summary statistics. For more information, please see Weissgerber et al (2015).

Results from JetFighter: We did not find any issues relating to colormaps.

Results from rtransparent:

Thank you for including a conflict of interest statement. Authors are encouraged to include this statement when submitting to a journal.
No funding statement was detected.
No protocol registration statement was detected.
