Dominant clade‐featured SARS‐CoV‐2 co‐occurring mutations reveal plausible epistasis: An in silico based hypothetical model
This article has been reviewed by the following groups
Listed in
- Evaluated articles (ScreenIT)
Abstract
Severe acute respiratory syndrome coronavirus 2 (SARS‐CoV‐2) has evolved into eight fundamental clades, four of which (G, GH, GR, and GV) were globally prevalent in 2020. To explain plausible epistatic effects of the signature co‐occurring mutations of these circulating clades on viral replication and transmission fitness, we proposed a hypothetical model using an in silico approach. Molecular docking and dynamics analyses showed the higher infectiousness of a spike mutant through more favorable binding of G614 with the elastase‐2. RdRp mutation p.P323L significantly increased genome‐wide mutations (p < 0.0001), allowing for more flexible RdRp (mutated)‐NSP8 interaction that may accelerate replication. Superior RNA stability and structural variation at NSP3:C3037T might impact protein, RNA interactions, or both. Another silent 5′‐UTR:C241T mutation might affect translational efficiency and viral packaging. These four G‐clade‐featured co‐occurring mutations might increase viral replication. Sentinel GH‐clade ORF3a:p.Q57H variants constricted the ion‐channel through intertransmembrane–domain interaction of cysteine(C81)‐histidine(H57). The GR‐clade N:p.RG203‐204KR would stabilize RNA interaction by a more flexible and hypo‐phosphorylated SR‐rich region. GV‐clade viruses seemingly gained the evolutionary advantage of the confounding factors; nevertheless, N:p.A220V might modulate RNA binding with no phenotypic effect. Our hypothetical model needs further retrospective and prospective studies to understand detailed molecular events and their relationship to the fitness of SARS‐CoV‐2.
Article activity feed
SciScore for 10.1101/2021.02.21.21252137:
Please note, not all rigor criteria are appropriate for all manuscripts.
Table 1: Rigor
NIH rigor criteria are not applicable to paper type.
Table 2: Resources
Software and Algorithms (each entry gives the flagged sentence, then the detected resource)
- Sentence: 2.1 Retrieval of Sequences and Mutation Analyses: This study analyzed 225,526 high-coverage (<1% Ns and <0.05% unique amino acid mutations) and complete (>29,000 nucleotide) genome sequences with specified collection date from a total of 3,16,166 sequences submitted to GISAID until January 03, 2021.
  Resource: Mutation Analyses; suggested: None
- Sentence: The frequency of mutations was tested for significance with the Wilcoxon signed-rank test between RdRp ‘C’ variant and ‘T’ variant using IBM SPSS statistics 25.
  Resource: SPSS; suggested: (SPSS, RRID:SCR_002865)
- Sentence: Random effect poisson regression model was performed in STATA v13.0 to identify the association between death-case ratio and different clade strains (G, GH, GR, and GV); both unadjusted and adjusted incidence risk ratio (IRR) were estimated where time was introduced as a panel variable 28. 2.2 Epidemiological Data Analysis and Time Plot Generation: In this study, we report the prevalence of these dominant clades in 2020, both individually and in combination, with disease progression and deaths allowing us to infer increasing fitness of the SARS-CoV-2.
  Resource: STATA; suggested: (Stata, RRID:SCR_012763)
- Sentence: The studied sequences were divided into six regions; Europe (n= 145,254), Americas (n= 48,014), Eastern Mediterranean (n=3,103), Southeast Asia (n= 4,134), West Pacific (n= 16,974), and Africa (n= 2,740). 2.3 Stability, Secondary and Three-Dimensional Structure Prediction Analyses of S, RdRp, ORF3a, and N Proteins: DynaMut 31 and FoldX 5.0 32, 33 were used to determine the stability of both wild and mutant variants of N, RdRp, S, and ORF3a proteins.
  Resource: FoldX; suggested: (FoldX, RRID:SCR_008522)
- Sentence: PredictProtein 34 was utilized for analyzing and predicting the possible secondary structure and solvent accessibility of both wild and mutant variants of those proteins.
  Resource: PredictProtein; suggested: None
- Sentence: Modeller v9.25 36 was also used to generate the structures against the same templates.
  Resource: Modeller; suggested: (MODELLER, RRID:SCR_008395)
- Sentence: The built-in structural assessment tools (Ramachandran plot, MolProbity, and Quality estimate) of SWISS-MODEL were used to check the quality of generated structures.
  Resource: MolProbity; suggested: (MolProbity, RRID:SCR_014226)
Results from OddPub: We did not detect open data. We also did not detect open code. Researchers are encouraged to share open data when possible (see Nature blog).
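As an illustration of the sequence inclusion criteria quoted in the resources table (complete genomes of more than 29,000 nucleotides with fewer than 1% Ns), a minimal Python sketch might look like the following. The function name and parameter names are hypothetical and not taken from the manuscript, and the third criterion (<0.05% unique amino acid mutations) is not covered because it depends on a reference dataset that is not described here.

```python
def passes_inclusion_criteria(
    sequence: str,
    min_length: int = 29_000,       # "complete" means more than 29,000 nt
    max_n_fraction: float = 0.01,   # "high-coverage" means fewer than 1% Ns
) -> bool:
    """Return True if a genome sequence is complete and high-coverage.

    Hypothetical helper: only the length and N-content thresholds quoted
    in the report are checked.
    """
    seq = sequence.upper()
    if len(seq) <= min_length:
        return False
    n_fraction = seq.count("N") / len(seq)
    return n_fraction < max_n_fraction


# Toy sequences, made up for illustration only.
complete_clean = "ACGT" * 7500                # 30,000 nt, no Ns
too_short = "ACGT" * 7000                     # 28,000 nt
too_many_ns = "N" * 400 + "ACGT" * 7400       # 30,000 nt, about 1.3% Ns
```

Calling `passes_inclusion_criteria` on each toy sequence keeps only the first one; the real filtering was presumably applied to GISAID metadata and sequences, which this sketch does not attempt to reproduce.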
Results from LimitationRecognizer: We detected the following sentences addressing limitations in the study: Like other SARS-CoV-2 studies 59–61, this statistical analysis also suffers from some limitations in dealing with genomic and calculating death-case ratio data. The death-case ratio is believed to be underestimated because of the inadequate number of tests capacity and asymptomatic SARS-CoV-2 cases in the general population. Moreover, fewer mutation patterns are uploaded from underdeveloped or developing countries like African and Sub-Saharan countries which might lead to spatial biasedness in the analysis. Therefore, the global epidemiological scenario of different clades was explored to mitigate this problem. Regional monthly data depicted a similar increase of GR and GV strains while the death ratio was decreasing in studied regions with some rare exceptions. In Europe, GR strains were predominant from April to August, while GV strains became predominant from September. In the Americas a high abundance (over 50%) of GH clade strains was found from March to December. In Eastern Mediterranean, Africa and Southeast Asian region there was an increased rise of GR strains in May, while in Western Pacific region GR strains became predominant in June. The increase of G strains and the death-case ratio at the same time was observed in most regions, except Eastern Mediterranean and Southeast Asia, where a limited number of sequence data were produced. (Figure 2b). Researchers around the globe are now trying to explore the factors associated with variable mortality rates due to the ...
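The regional statements quoted above (for example, a clade being "predominant" in a given month, or exceeding 50% abundance) rest on monthly clade proportions. A minimal sketch of that bookkeeping, assuming each sequence is reduced to a (month, clade) pair, is shown below. The function names and the toy records are hypothetical and are not the authors' data or code.

```python
from collections import Counter, defaultdict


def monthly_clade_shares(records):
    """Map each month to the fraction of sequences belonging to each clade.

    `records` is an iterable of (month, clade) pairs such as ("2020-09", "GV").
    """
    counts = defaultdict(Counter)
    for month, clade in records:
        counts[month][clade] += 1

    shares = {}
    for month, clade_counts in counts.items():
        total = sum(clade_counts.values())
        shares[month] = {clade: n / total for clade, n in clade_counts.items()}
    return shares


def predominant_clade(shares_for_month):
    """Return the clade with the largest share in one month."""
    return max(shares_for_month, key=shares_for_month.get)


# Toy records, made up for illustration only.
toy_records = [
    ("2020-04", "GR"),
    ("2020-09", "GV"),
    ("2020-09", "GV"),
    ("2020-09", "GR"),
]
toy_shares = monthly_clade_shares(toy_records)
```

With real data the same shares could be computed per region before comparing them against the monthly death-case ratio; how the manuscript actually aggregated its data is not stated in this report.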
Results from TrialIdentifier: No clinical trial numbers were referenced.
Results from Barzooka: We did not find any issues relating to the usage of bar graphs.
Results from JetFighter: Please consider improving the rainbow (“jet”) colormap(s) used on pages 44 and 47. At least one figure is not accessible to readers with colorblindness and/or is not true to the data, i.e. not perceptually uniform.
Results from rtransparent:
- Thank you for including a conflict of interest statement. Authors are encouraged to include this statement when submitting to a journal.
- Thank you for including a funding statement. Authors are encouraged to include this statement when submitting to a journal.
- No protocol registration statement was detected.