Time Series Analysis of SARS-CoV-2 Genomes and Correlations among Highly Prevalent Mutations

Abstract

We performed a meta-analysis on SARS-CoV-2 genomes categorized by collection month and identified several significant mutations. Pearson correlation analysis of these significant mutations identified 16 comutations having absolute correlation coefficients of >0.4 and a frequency of >30% in the genomes used in this study.
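
As a rough illustration of the filtering described in the abstract, the sketch below computes Pearson correlations between mutation columns of a binary presence/absence matrix and keeps pairs with |r| > 0.4 among mutations present in more than 30% of genomes. This is not the authors' code: the function name, the 0/1 matrix layout, and the reading of the 30% threshold as a per-mutation frequency are assumptions made for the example.

```python
import numpy as np


def correlated_mutation_pairs(presence, min_abs_r=0.4, min_freq=0.30):
    """Return (site_i, site_j, r) for strongly correlated, prevalent mutations.

    presence: hypothetical (n_genomes, n_sites) array of 0/1 flags, where 1
    means the genome carries a mutation at that site (assumed layout).
    """
    presence = np.asarray(presence, dtype=float)
    freq = presence.mean(axis=0)
    # Keep prevalent sites; drop sites mutated in every genome, because a
    # constant column has zero variance and an undefined Pearson coefficient.
    keep = np.flatnonzero((freq > min_freq) & (freq < 1.0))
    if keep.size < 2:
        return []
    r = np.corrcoef(presence[:, keep], rowvar=False)
    pairs = []
    for a in range(keep.size):
        for b in range(a + 1, keep.size):
            if abs(r[a, b]) > min_abs_r:
                pairs.append((int(keep[a]), int(keep[b]), float(r[a, b])))
    return pairs


# Toy data: 6 genomes x 4 sites (illustrative values only).
toy = [
    [1, 1, 0, 0],
    [1, 1, 0, 1],
    [0, 0, 1, 0],
    [1, 1, 1, 0],
    [0, 0, 0, 0],
    [1, 1, 1, 0],
]
toy_pairs = correlated_mutation_pairs(toy)
```

In the toy data, sites 0 and 1 always occur together and site 3 falls below the frequency cut-off, so only the pair (0, 1) is reported.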

SciScore for 10.1101/2022.04.05.487114:

Please note, not all rigor criteria are appropriate for all manuscripts.

Table 1: Rigor

NIH rigor criteria are not applicable to paper type.

Table 2: Resources

Software and Algorithms
Sentences	Resources
All the SARS-CoV-2 genomic sequences were collected in a month-wise manner (based on the sample collection month) from the Virus Pathogen Resource (ViPR) database [18].	ViPR suggested: (vipR, RRID:SCR_010685)
We created an empty matrix with 30000 columns and 55759 rows using the NumPy module of Python.	NumPy suggested: (NumPy, RRID:SCR_008633)
To make the visualization more effective, we represent the dendrogram with a heatmap using the pdist and squareform method of scipy library.	scipy suggested: (SciPy, RRID:SCR_008058)
All these analyses were performed in python.	python suggested: (IPython, RRID:SCR_001658)
Functional impacts of mutations: To investigate the effect of mutation on protein function, we used the widely popular PredictSNP web server [22] available at https://loschmidt.chemi.muni.cz/predictsnp/.	PredictSNP suggested: (PredictSNP, RRID:SCR_006327)
This web tool is composed of six different predictors, PhD-SNP, MAPP, SNAP, PolyPhen-1, SIFT and PolyPhen-2 to predict whether mutation is deleterious or neutral.	SNAP suggested: (SNAP, RRID:SCR_007936); SIFT suggested: (SIFT, RRID:SCR_012813)
PhD-SNP, MAPP, SNAP, PolyPhen-1, SIFT and PolyPhen-2 apply support vector machine, physicochemical characteristics and protein sequence alignment score, neural network approach, expert set of empirical rules, protein sequence alignment score and naïve Bayes respectively [22].	PhD-SNP suggested: (PhD-SNP, RRID:SCR_010782); PolyPhen-2 suggested: None
In addition to its own ΔΔG prediction, DynaMut also predicts ΔΔG using NMA based ENCoM (Elastic network contact model) [24] and, other structure-based predictors like mCSM [25], SDM [26] and DUET [27].	mCSM suggested: (mCSM, RRID:SCR_010776)
To calculate the nLMI of WT and MT proteins, we employed Python-based correlationplus 0.2.1 tool [33].	Python-based suggested: None
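
The matrix and clustering steps quoted in the table above can be sketched as follows. This is a minimal example under stated assumptions, not the authors' pipeline: the helper names, the 0-based site positions, the Hamming metric, and the average-linkage method are all illustrative choices. Only `numpy.zeros`, `pdist`, `squareform`, `linkage`, and `leaves_list` are real library calls.

```python
import numpy as np
from scipy.cluster.hierarchy import leaves_list, linkage
from scipy.spatial.distance import pdist, squareform


def build_mutation_matrix(mutations_per_genome, genome_length):
    """Fill a zero matrix (rows = genomes, columns = sites) with 0/1 flags.

    mutations_per_genome: hypothetical list of 0-based mutated positions
    for each genome (assumed input format).
    """
    # At the size quoted in the text (55759 x 30000), uint8 needs about 1.7 GB.
    matrix = np.zeros((len(mutations_per_genome), genome_length), dtype=np.uint8)
    for row, positions in enumerate(mutations_per_genome):
        matrix[row, np.array(list(positions), dtype=np.intp)] = 1
    return matrix


def clustered_distance_matrix(matrix, metric="hamming", method="average"):
    """Return the site-by-site distance matrix in dendrogram leaf order."""
    condensed = pdist(matrix.T, metric=metric)   # pairwise distances between sites
    order = leaves_list(linkage(condensed, method=method))
    square = squareform(condensed)               # condensed vector -> square matrix
    return square[np.ix_(order, order)], order


# Toy data: 4 genomes, 4 sites (illustrative values only).
toy_matrix = build_mutation_matrix([[0, 1], [0, 1], [2], [0, 1, 2]], genome_length=4)
toy_clustered, toy_order = clustered_distance_matrix(toy_matrix)
```

The reordered square matrix can then be passed to a plotting routine (for example matplotlib's `imshow`) to draw the heatmap next to the dendrogram.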

Results from OddPub: We did not detect open data. We also did not detect open code. Researchers are encouraged to share open data when possible (see Nature blog).

Results from LimitationRecognizer: An explicit section about the limitations of the techniques employed in this study was not found. We encourage authors to address study limitations.

Results from TrialIdentifier: No clinical trial numbers were referenced.

Results from Barzooka: We did not find any issues relating to the usage of bar graphs.

Results from JetFighter: We did not find any issues relating to colormaps.

Results from rtransparent:

Thank you for including a conflict of interest statement. Authors are encouraged to include this statement when submitting to a journal.
No funding statement was detected.
No protocol registration statement was detected.

Results from scite Reference Check: We found no unreliable references.

Related articles

Genomic characterization of SARS-CoV-2 variants circulating in the population of Bangui, Central African Republic (CAR) in 2022.

Overview of SARS-CoV-2 Genomic Surveillance in Central America and the Dominican Republic from February 2020 to January 2023: The Impact of PAHO and COMISCA's Collaborative Efforts

DIVERSITY AND CLINICAL CORRELATIONS OF SARS-CoV-2 VARIANT DURING THE INTRODUCTION OF THE DELTA VARIANT IN GUATEMALA