The genetic variant analyses of SARS-CoV-2 strains circulating in Bangladesh

Abstract

Genomic mutations may affect how the virus adapts to the local environment, its transmission, disease manifestation, and the effectiveness of existing treatments and vaccines. The objectives of this study were to characterize genomic variations, non-synonymous amino acid substitutions (especially in target proteins), mutation events per sample, the mutation rate, and the overall scenario of coronaviruses across the country. To investigate genetic diversity, a total of 184 genomes of virus strains sampled from different divisions of Bangladesh, with sampling dates between the 10th of May 2020 and the 27th of June 2020, were analyzed. To date, a total of 634 mutations located along the entire genome, resulting in 274 non-synonymous amino acid substitutions in 22 different proteins, were detected, with the nucleotide mutation rate estimated to be 23.715 substitutions per year. The highest number of non-synonymous amino acid substitutions was observed in the papain-like protease (nsp3), at 48 different positions. Although no mutations were found in nsp7, nsp9, nsp10, and nsp11, orf1ab accounts for 56% of total mutations. Among the structural proteins, the highest number of non-synonymous amino acid substitutions (at 36 positions) was observed in the spike protein, in which 9 unique locations were detected relative to the global strains, including 516E>Q at the boundary of the ACE2 binding region. The dominant variant, G614 (95%) in the spike protein, is circulating across the country together with other co-evolving variants, including L323 (94%) in RNA-dependent RNA polymerase (RdRp), K203 (82%) and R204 (82%) in nucleocapsid, and F120 (78%) in nsp2. These variants are mostly seen as linked mutations and are part of a haplotype observed in Europe. The data suggest effective containment of clade G strains (4.8%), with sub-clusters GR (82.4%) and GH (6.4%).
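
To make the distinction between nucleotide mutations and non-synonymous amino acid substitutions concrete, the following is a minimal Python sketch, not the authors' pipeline. It assumes each sample's coding sequence is already aligned codon-for-codon to a reference; the function names (`classify_substitutions`, `variant_frequencies`) and the 12-nucleotide toy sequences are hypothetical and chosen only for illustration.

```python
from collections import Counter

# Standard genetic code (NCBI translation table 1), in canonical TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def classify_substitutions(ref_cds, sample_cds):
    """Compare two aligned, in-frame coding sequences codon by codon.

    Returns (synonymous_positions, non_synonymous_labels), where positions
    are 1-based codon numbers and labels look like 'D2G'. Codons containing
    gaps or ambiguous bases are skipped.
    """
    if len(ref_cds) != len(sample_cds):
        raise ValueError("sequences must be aligned to the same length")
    synonymous, non_synonymous = [], []
    usable = len(ref_cds) - len(ref_cds) % 3
    for i in range(0, usable, 3):
        ref_codon = ref_cds[i:i + 3].upper()
        alt_codon = sample_cds[i:i + 3].upper()
        if ref_codon == alt_codon:
            continue
        ref_aa = CODON_TABLE.get(ref_codon)
        alt_aa = CODON_TABLE.get(alt_codon)
        if ref_aa is None or alt_aa is None:
            continue  # gap or ambiguous base: cannot be classified
        position = i // 3 + 1
        if ref_aa == alt_aa:
            synonymous.append(position)
        else:
            non_synonymous.append(f"{ref_aa}{position}{alt_aa}")
    return synonymous, non_synonymous


def variant_frequencies(ref_cds, samples):
    """Fraction of samples carrying each non-synonymous substitution."""
    counts = Counter()
    for sample in samples:
        _, non_synonymous = classify_substitutions(ref_cds, sample)
        counts.update(set(non_synonymous))
    return {label: n / len(samples) for label, n in counts.items()}


# Toy data (illustrative only, not from the study).
reference = "ATGGATGTTCTT"          # M D V L
toy_samples = [
    "ATGGGTGTTCTT",                 # GAT -> GGT changes D to G
    "ATGGATGTCCTT",                 # GTT -> GTC keeps V (synonymous)
    "ATGGGTGTTCTT",                 # GAT -> GGT changes D to G
]
frequencies = variant_frequencies(reference, toy_samples)
```

In the toy data, two of the three samples carry the same aspartate-to-glycine change, so its frequency is 2/3, while the third sample carries only a synonymous change and contributes nothing. Percentages such as those quoted for G614 or L323 are frequencies of this kind, computed over the analyzed genomes.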

Highlights

  • We sequenced 137 and analyzed 184 whole-genome sequences of SARS-CoV-2 strains from different divisions of Bangladesh.

  • A total of 634 mutation sites across the SARS-CoV-2 genome and 274 non-synonymous amino acid substitutions were detected.

  • The mutation rate of SARS-CoV-2 was estimated to be 23.715 nucleotide substitutions per year.

  • Nine unique variants were detected based on non-synonymous amino acid substitutions in the spike protein relative to the global SARS-CoV-2 strains.

Article activity feed

    1. SciScore for 10.1101/2020.07.29.226555

      Please note, not all rigor criteria are appropriate for all manuscripts.

      Table 1: Rigor

      Institutional Review Board Statement: not detected.
      Randomization: not detected.
      Blinding: not detected.
      Power Analysis: not detected.
      Sex as a biological variable: not detected.

      Table 2: Resources

      Software and Algorithms

      Sentence: Sequence alignment for constructed 184 datasets was performed using Multiple Sequence Comparison by Log-Expectation (MUSCLE) software and MEGAX (Kumar et al., 2018).
      Resource: MUSCLE
      Suggested: (MUSCLE, RRID:SCR_011812)

      Sentence: Aligned sequences were separately displayed in the AliView to verify that the sequences were in the frame.
      Resource: AliView
      Suggested: (AliView, RRID:SCR_002780)
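
The quoted sentence describes a visual check in AliView that the aligned sequences are in frame. As a rough programmatic analogue, and not a description of what AliView or the authors did, a minimal Python sketch could flag coding sequences whose ungapped length is not a multiple of three or that contain a premature stop codon. The function name `frame_problems` and the example sequences are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def frame_problems(aligned_cds):
    """Return a list of problems found in one aligned coding sequence.

    An empty list means the sequence looks in frame: after removing
    alignment gaps ('-'), its length is a multiple of three and no stop
    codon appears before the final codon.
    """
    seq = aligned_cds.upper().replace("-", "")
    problems = []
    if len(seq) % 3 != 0:
        problems.append(f"length {len(seq)} is not a multiple of 3")
    usable = len(seq) - len(seq) % 3
    codons = [seq[i:i + 3] for i in range(0, usable, 3)]
    # The last complete codon is allowed to be the terminal stop codon.
    for index, codon in enumerate(codons[:-1], start=1):
        if codon in STOP_CODONS:
            problems.append(f"internal stop codon {codon} at codon {index}")
    return problems


# Toy sequences (illustrative only, not from the study).
in_frame = frame_problems("ATGGATGTTTAA")
frameshift = frame_problems("ATGGA-GTTTAA")
early_stop = frame_problems("ATGTAAGTTTAA")
```

A single-base gap, as in the second toy sequence, leaves 11 nucleotides and is reported as a length problem. A check like this only complements visual inspection, since it cannot tell a genuine frameshift mutation from a sequencing or alignment error.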

      Results from OddPub: We did not detect open data. We also did not detect open code. Researchers are encouraged to share open data when possible (see Nature blog).


      Results from LimitationRecognizer: An explicit section about the limitations of the techniques employed in this study was not found. We encourage authors to address study limitations.

      Results from TrialIdentifier: No clinical trial numbers were referenced.


      Results from Barzooka: We did not find any issues relating to the usage of bar graphs.


      Results from JetFighter: We did not find any issues relating to colormaps.


      Results from rtransparent:
      • Thank you for including a conflict of interest statement. Authors are encouraged to include this statement when submitting to a journal.
      • Thank you for including a funding statement. Authors are encouraged to include this statement when submitting to a journal.
      • No protocol registration statement was detected.

      About SciScore

      SciScore is an automated tool that is designed to assist expert reviewers by finding and presenting formulaic information scattered throughout a paper in a standard, easy to digest format. SciScore checks for the presence and correctness of RRIDs (research resource identifiers), and for rigor criteria such as sex and investigator blinding. For details on the theoretical underpinning of rigor criteria and the tools shown here, including references cited, please follow this link.

    2. SciScore for 10.1101/2020.07.29.226555

      Please note, not all rigor criteria are appropriate for all manuscripts.

      Table 1: Rigor

      Institutional Review Board Statement: not detected.
      Randomization: not detected.
      Blinding: not detected.
      Power Analysis: not detected.
      Sex as a biological variable: Viral isolates were collected from 64% of male patients and 36% of females (Fig. 1b).

      Table 2: Resources

      Software and Algorithms

      Sentence: Sequence alignment for constructed 184 datasets was performed using Multiple Sequence Comparison by Log-Expectation (MUSCLE) software and MEGAX (Kumar et al., 2018).
      Resource: MUSCLE
      Suggested: (MUSCLE, SCR_011812)

      Sentence: Aligned sequences were separately displayed in the AliView to verify that the sequences were in the frame.
      Resource: AliView
      Suggested: (AliView, SCR_002780)

      Data from additional tools added to each annotation on a weekly basis.

      About SciScore

      SciScore is an automated tool that is designed to assist expert reviewers by finding and presenting formulaic information scattered throughout a paper in a standard, easy to digest format. SciScore is not a substitute for expert review. SciScore checks for the presence and correctness of RRIDs (research resource identifiers) in the manuscript, and detects sentences that appear to be missing RRIDs. SciScore also checks to make sure that rigor criteria are addressed by authors. It does this by detecting sentences that discuss criteria such as blinding or power analysis. SciScore does not guarantee that the rigor criteria that it detects are appropriate for the particular study. Instead it assists authors, editors, and reviewers by drawing attention to sections of the manuscript that contain or should contain various rigor criteria and key resources. For details on the results shown here, including references cited, please follow this link.