Molecular characterization of SARS-CoV-2 from Bangladesh: Implications in genetic diversity, possible origin of the virus, and functional significance of the mutations
This article has been reviewed by the following groups
Listed in
- Evaluated articles (ScreenIT)
Abstract
In an effort to understand the pathogenesis, evolution and epidemiology of the SARS-CoV-2 virus, scientists all over the world are tracking its genomic changes in real time. Genomic studies can help in understanding the disease dynamics. We downloaded 324 complete and near-complete SARS-CoV-2 genomes from Bangladesh that had been submitted to the GISAID database and were isolated between 30 March and 7 September 2020. We then compared these genomes with the Wuhan reference sequence and found 4160 mutation events, including 2253 missense single nucleotide variations, 38 deletions and 10 insertions. The C>T nucleotide change was the most prevalent (41% of all mutations), possibly due to selective mutation pressure to reduce CpG sites and thereby evade the CpG-targeted host immune response. The most frequent mutation, found in 98% of isolates, was 3037C>T, a synonymous change that almost always accompanied three other mutations: 241C>T, 14408C>T (P323L in RdRp) and 23403A>G (D614G in the spike protein). P323L has been reported to increase the mutation rate, and D614G is associated with increased viral replication and is currently the most prevalent variant circulating worldwide. We identified multiple missense mutations in predicted B-cell and T-cell epitope regions and/or PCR target regions (including R203K and G204R, which occurred in 86% of the isolates) that may affect immunogenicity and/or RT-PCR-based diagnosis. Our analysis revealed 5 large deletion events in the ORF7a and ORF8 gene products that may be associated with reduced disease severity and increased viral clearance. Our phylogenetic analysis showed that most of the isolates belonged to Nextstrain clade 20B (86%) and GISAID clade GR (88%). Most of our isolates shared common ancestors either directly with European countries or jointly with Middle Eastern countries as well as Australia and India. Interestingly, the 19B clade (GISAID clade S), which was originally prevalent in China, was found only in isolates from Chittagong.
This suggests multiple introductions of the virus into Bangladesh via different routes. Hence, more genome sequencing and analysis, together with related clinical data, are needed to interpret the functional significance of these mutations and to better predict the disease dynamics, which may help policy makers control the COVID-19 pandemic in Bangladesh.
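The abstract reports the share of each nucleotide change (for example, C>T at 41% of all mutations). The following is a minimal sketch of how such a tally could be computed from mutation labels written in the abstract's own notation (such as 3037C>T). It is not the authors' pipeline, which used Genome Detective and CoVsurver; the function names `parse_snv` and `substitution_shares` and the toy input list are assumptions made for illustration only.

```python
import re
from collections import Counter

# Matches single-nucleotide substitutions written as <position><ref>><alt>, e.g. "3037C>T".
SNV_PATTERN = re.compile(r"^(\d+)([ACGT])>([ACGT])$")


def parse_snv(label):
    """Return (position, ref, alt) for a label such as '3037C>T', or None if it does not match."""
    match = SNV_PATTERN.match(label.strip().upper())
    if match is None:
        return None
    position, ref, alt = match.groups()
    return int(position), ref, alt


def substitution_shares(labels):
    """Tally substitution classes (e.g. 'C>T') and return each class's share of all parsed SNVs."""
    counts = Counter()
    for label in labels:
        parsed = parse_snv(label)
        if parsed is None:
            continue  # skip deletions, insertions, or malformed labels
        _, ref, alt = parsed
        counts[f"{ref}>{alt}"] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {change: n / total for change, n in counts.items()}


# Toy input (hypothetical single isolate): the four co-occurring changes named in the abstract.
example_labels = ["241C>T", "3037C>T", "14408C>T", "23403A>G"]
shares = substitution_shares(example_labels)
print(shares)
```

On real data, the labels would come from a variant-calling or typing tool's output, and the shares would be computed over all mutation events across all isolates rather than over one toy list.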
Article activity feed
SciScore for 10.1101/2020.10.12.336099:
Please note, not all rigor criteria are appropriate for all manuscripts.
Table 1: Rigor
- Institutional Review Board Statement: not detected.
- Randomization: not detected.
- Blinding: not detected.
- Power Analysis: not detected.
- Sex as a biological variable: not detected.
Table 2: Resources
Software and Algorithms
- Sentence: "Mutation analysis: We have used Genome Detective Coronavirus Typing Tool version 1.13 and CoVsurver enabled by GISAID to analyse our query sequences in FASTA format (16, 17)."
  Resource: CoVsurver (suggested: None)
- Sentence: "For functional prediction of mutational changes, we have used two web based tools namely SIFT (Sorting Intolerant From Tolerant) and MutPred2 (18, 19)."
  Resource: SIFT (suggested: SIFT, RRID:SCR_012813)
Results from OddPub: We did not detect open data. We also did not detect open code. Researchers are encouraged to share open data when possible (see Nature blog).
Results from LimitationRecognizer: We detected the following sentences addressing limitations in the study:
As a limitation of our study, we couldn’t derive any clinical information of the patients from whom the samples were collected. The functional significance described in this paper are only computational prediction based and may not always reflect clinical scenario. Also, the genomic sequences were derived using different sequencing platforms (i.e. Illumina, Ion Torrent etc.) and methods (Sanger and Next-generation sequencing) by different laboratories which may have impacted the quality of the sequences hence impacted our analysis. We have found one sequenced that has no mutation compared to the reference sequence which is very unlikely and may possibly be a submission error as the sample was collected long after the original Wuhan outbreak. We hope our findings will create scopes for further research specially including clinical data and also help identifying changes in pathogenicity and infectivity pattern of the virus.
Results from TrialIdentifier: No clinical trial numbers were referenced.
Results from Barzooka: We did not find any issues relating to the usage of bar graphs.
Results from JetFighter: We did not find any issues relating to colormaps.
Results from rtransparent:
- Thank you for including a conflict of interest statement. Authors are encouraged to include this statement when submitting to a journal.
- Thank you for including a funding statement. Authors are encouraged to include this statement when submitting to a journal.
- No protocol registration statement was detected.