In depth analysis of Cyprus-specific mutations of SARS-CoV-2 strains using computational approaches
Abstract
Background
This study aims to characterize SARS-CoV-2 mutations which are primarily prevalent in the Cypriot population. Moreover, using computational approaches, we assess whether these mutations are associated with changes in viral virulence.
Methods
We utilize genetic data from 144 sequences of SARS-CoV-2 strains from the Cypriot population obtained between March 2020 and January 2021, as well as all data available from GISAID. We combine this with countries’ regional information, such as deaths and cases per million, as well as COVID-19-related public health austerity measure response times. Initial indications of selective advantage of Cyprus-specific mutations are obtained by mutation tracking analysis. This entails calculating specific mutation frequencies within the Cypriot population and comparing these with their prevalence world-wide throughout the course of the pandemic. We further make use of linear regression models to extrapolate additional information that may be missed through standard statistical analysis.
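The mutation tracking step described above can be illustrated with a minimal sketch. It assumes each sequence has already been reduced to a record with a country, a collection month, and a set of mutation labels; the record layout, the function name `monthly_frequency`, and the toy records are hypothetical and are not taken from the study or from GISAID.

```python
from collections import defaultdict


def monthly_frequency(records, mutation, country=None):
    """Return {month: fraction of sequences carrying `mutation`}.

    records: iterable of dicts with hypothetical keys
        'country' (str), 'month' ('YYYY-MM'), 'mutations' (set of str).
    country: restrict to one country, or None to use every record.
    """
    totals = defaultdict(int)    # sequences observed per month
    carriers = defaultdict(int)  # sequences carrying the mutation per month
    for rec in records:
        if country is not None and rec["country"] != country:
            continue
        totals[rec["month"]] += 1
        if mutation in rec["mutations"]:
            carriers[rec["month"]] += 1
    return {month: carriers[month] / totals[month] for month in sorted(totals)}


# Toy records, invented purely for illustration (not data from the study).
records = [
    {"country": "Cyprus", "month": "2020-11", "mutations": {"S6059F"}},
    {"country": "Cyprus", "month": "2020-11", "mutations": set()},
    {"country": "Cyprus", "month": "2020-12", "mutations": {"S6059F"}},
    {"country": "Other", "month": "2020-11", "mutations": set()},
    {"country": "Other", "month": "2020-12", "mutations": set()},
    {"country": "Other", "month": "2020-12", "mutations": {"S6059F"}},
]

cyprus = monthly_frequency(records, "S6059F", country="Cyprus")
world = monthly_frequency(records, "S6059F")
for month in cyprus:
    print(month, f"Cyprus={cyprus[month]:.2f}", f"world={world[month]:.2f}")
```

A mutation whose frequency stays consistently higher in the national series than in the worldwide series is a candidate for the kind of enrichment the study reports; whether the difference is significant would need a formal test that this sketch does not include.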
Results
We report a single mutation found in the ORF1ab gene (nucleotide position 18,440) that appears to be significantly enriched within the Cypriot population. The amino acid change is denoted as S6059F, which maps to the SARS-CoV-2 NSP14 protein. We further analyse this mutation using regression models to investigate possible associations with increased deaths and cases per million. Moreover, protein structure prediction tools show that the mutation induces a conformational change in the protein that significantly alters its structure when compared to the reference protein.
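The link between nucleotide position 18,440 and residue 6059 of the ORF1ab polyprotein can be checked with a short calculation. The sketch below assumes the coordinates annotated for the Wuhan-Hu-1 reference (NC_045512.2): ORF1ab spanning 266..13468 joined to 13468..21555 because of the -1 ribosomal frameshift, and the nsp14 mature peptide starting at nucleotide 18040. These coordinates and the function name `orf1ab_codon` are assumptions brought in for illustration; they are not stated in the abstract.

```python
# Assumed coordinates from the NC_045512.2 annotation (1-based, inclusive).
ORF1AB_START = 266       # first base of the ORF1ab start codon
FRAMESHIFT_SITE = 13468  # base read twice because of the -1 frameshift
ORF1AB_END = 21555       # last base of ORF1ab
NSP14_START = 18040      # first base of the nsp14 mature peptide


def orf1ab_codon(nt_pos):
    """Map a genomic position to (codon number, position within codon)."""
    if not ORF1AB_START <= nt_pos <= ORF1AB_END:
        raise ValueError("position lies outside ORF1ab")
    cds_pos = nt_pos - ORF1AB_START + 1  # 1-based offset into the coding sequence
    if nt_pos >= FRAMESHIFT_SITE:
        # The slippery base is translated twice, so downstream offsets grow
        # by one. The frameshift base itself is reported in its second reading.
        cds_pos += 1
    codon = (cds_pos - 1) // 3 + 1
    pos_in_codon = (cds_pos - 1) % 3 + 1
    return codon, pos_in_codon


codon, pos_in_codon = orf1ab_codon(18440)
nsp14_first_codon, _ = orf1ab_codon(NSP14_START)
nsp14_residue = codon - nsp14_first_codon + 1

print(codon, pos_in_codon)  # ORF1ab codon and position within that codon
print(nsp14_residue)        # residue number counted from the start of nsp14
```

Under these assumptions position 18,440 falls on the second base of codon 6059, which agrees with the S6059F label, and the same residue is number 134 when counted from the start of nsp14.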
Conclusions
Investigating Cyprus-specific mutations for SARS-CoV-2 can lead to a better understanding of viral pathogenicity. Researching these mutations can generate potential links between viral-specific mutations and the unique genomics of the Cypriot population. This can not only lead to important findings from which to battle the pandemic on a national level, but also provide insights into viral virulence worldwide.
Article activity feed
SciScore for 10.1101/2021.06.08.447477
Please note, not all rigor criteria are appropriate for all manuscripts.
Table 1: Rigor
NIH rigor criteria are not applicable to paper type.
Table 2: Resources
Software and Algorithms

| Sentences | Resources |
| --- | --- |
| Raw data analysis: The Burrows-Wheeler Aligner (BWA) [11], version: 0.7.15 was used to map the raw reads to Wuhan-Hu-1 (NCBI ID:NC_045512.2) | BWA, suggested: (BWA, RRID:SCR_010910) |
| Duplicate reads, which are likely to be the results of PCR bias, were marked using Picard (http://broadinstitute.github.io/picard/) 2.6.0. | Picard, suggested: (Picard, RRID:SCR_006525) |
| SAMtools [12], version: 0.1.19, was used for additional BAM/SAM file manipulations. | SAMtools, suggested: (SAMTOOLS, RRID:SCR_002105) |
| Finally, the GATK FastaAlternateReferenceMaker method was used for consensus sequence extraction from the vcf files. | GATK, suggested: (GATK, RRID:SCR_001876) |
| MAFFT [17] was used to construct a multiple sequence alignment (MSA). | MAFFT, suggested: (MAFFT, RRID:SCR_011811) |
| Phylogeny was estimated using the RAxML [18] maximum likelihood algorithm for phylogenetic tree construction. | RAxML, suggested: (RAxML, RRID:SCR_006086) |
| Analysis was performed using R (packages: dplyr, tidyr, ggplot2, ggtree, phytools, phangorn). | ggplot2, suggested: (ggplot2, RRID:SCR_014601) |
| I-TASSER was selected for protein structure modelling, since it outperformed other servers according to results from the 14th Community Wide Experiment on the Critical Assessment of Techniques for Protein Structure Prediction (CASP14) (https://zhanglab.ccmb.med.umich.edu/casp14/, last accessed 23/03/2021). | I-TASSER, suggested: (I-TASSER, RRID:SCR_014627) |
| The DynaMut webserver [21], was used to visualize non-covalent molecular interactions, calculated by the Arpeggio algorithm [22] | Arpeggio, suggested: (Arpeggio, RRID:SCR_010876) |
| Structural alignment was performed using the align tool of PyMOL and all-atom RMSD values were calculated without any outliers’ rejection, with zero cycles of refinement. | PyMOL, suggested: (PyMOL, RRID:SCR_000305) |

Results from OddPub: Thank you for sharing your code.
Results from LimitationRecognizer: An explicit section about the limitations of the techniques employed in this study was not found. We encourage authors to address study limitations.
Results from TrialIdentifier: No clinical trial numbers were referenced.
Results from Barzooka: We did not find any issues relating to the usage of bar graphs.
Results from JetFighter: We did not find any issues relating to colormaps.
Results from rtransparent:
- Thank you for including a conflict of interest statement. Authors are encouraged to include this statement when submitting to a journal.
- No funding statement was detected.
- No protocol registration statement was detected.
Results from scite Reference Check: We found no unreliable references.