Genomic surveillance unfolds the dynamics of SARS-CoV-2 transmission and divergence in Bangladesh over the past two years

Abstract

The highly pathogenic virus SARS-CoV-2 has overwhelmed healthcare systems worldwide, causing the COVID-19 pandemic since it was first detected in Wuhan, China. Consequently, scrutinizing the genome structure and tracing the transmission of the virus have attracted enormous interest as a basis for designing appropriate intervention strategies to control the pandemic. In this report, we examined 4622 sequences from Bangladesh and found that they belonged to thirty-five major PANGO lineages; Delta alone accounted for 39%, and 78% came from just four primary lineages. By building a transmission network, we also identified Dhaka as the hub of viral transmission and observed the virus spreading back and forth across the country at different times. The analysis identified 7659 unique mutations, with an average of 24.61 missense mutations per sequence. Moreover, our analysis of genetic diversity and mutation patterns revealed that eight genes were under negative (purifying) selection, which removes deleterious mutations, while three genes were under positive selection.

Importance

With 29,122 deaths, 1.95 million infections and a shattered healthcare system from SARS-CoV-2 in Bangladesh, the only way to avoid further complications is to break the transmission network of the virus. Therefore, it is vital to shed light on the transmission, divergence, mutations, and emergence of new variants using genomic data analyses and surveillance. Here, we present the geographic and temporal distribution of different SARS-CoV-2 variants throughout Bangladesh over the past two years, and their current prevalence. Further, we have developed a transmission network of viral spread, which in turn will help guide intervention measures. We then analyzed all the mutations that occurred and their effect on evolution, as well as the currently circulating mutations that could trigger a new variant of concern. In short, together with an ongoing genomic surveillance program, these data will help to better understand SARS-CoV-2, its evolution, and pandemic characteristics in Bangladesh.

Article activity feed

  1. SciScore for 10.1101/2022.04.13.488264

    Please note, not all rigor criteria are appropriate for all manuscripts.

    Table 1: Rigor

    NIH rigor criteria are not applicable to paper type.

    Table 2: Resources

    Software and Algorithms
    Sentences | Resources
    Transmission analysis: First, the selected sequences were aligned using the MAFFT algorithm (21), followed by the construction of a maximum likelihood phylogenetic tree using IQ-TREE (22) and calibrating the tree based on time with TreeTime (23).
    IQ-TREE
    suggested: (IQ-TREE, RRID:SCR_017254)
    Mutation Analysis: We have aligned each sequence with the reference sequence (NC_045512.2) (8) using the minimap2 algorithm (25) and called the variants with Samtools (26).
    Mutation Analysis
    suggested: None
    Samtools
    suggested: (SAMTOOLS, RRID:SCR_002105)
    Finally, SNPeff was used to predict the impact of the mutations (30).
    SNPeff
    suggested: (SnpEff, RRID:SCR_005191)
    Effects of mutation: First of all, we used TASSEL software (31) to determine the nucleotide diversity (π) using a 20 base-pair window at five base-pair steps.
    TASSEL
    suggested: (TASSEL, RRID:SCR_012837)
    Then we calculated the direction of selection in the sequences to determine whether diversity moves away from neutrality and to understand the pattern of evolution using the SLAC algorithm (32) in the HyPhy software package (33).
    HyPhy
    suggested: (HyPhy, RRID:SCR_016162)
    Linkage disequilibrium among mutations prevalent in 10% or more sequences was calculated using AutoVem (34) and presented by the r² index using HaploView (35).
    HaploView
    suggested: (Haploview, RRID:SCR_003076)
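
As a rough illustration of two quantities named in the table above, the sketch below computes sliding-window nucleotide diversity (π) with the 20 base-pair window and five base-pair step quoted from the paper, and the pairwise r² commonly used to report linkage disequilibrium. It is not the implementation used by TASSEL, AutoVem, or HaploView. The function names `site_diversity`, `sliding_pi`, and `r_squared` are hypothetical, the input is assumed to be an already aligned set of equal-length sequences, gaps and ambiguous bases are simply ignored, and each viral genome is treated as haploid.

```python
from collections import Counter


def site_diversity(column):
    """Fraction of sequence pairs that differ at one alignment column.

    Only A/C/G/T are counted; gaps and ambiguity codes are skipped
    (an assumption of this sketch, not necessarily TASSEL's behaviour).
    """
    counts = Counter(base for base in column.upper() if base in "ACGT")
    n = sum(counts.values())
    if n < 2:
        return 0.0
    # Ordered pairs carrying the same base, out of n * (n - 1) ordered pairs.
    same_pairs = sum(c * (c - 1) for c in counts.values())
    return 1.0 - same_pairs / (n * (n - 1))


def sliding_pi(seqs, window=20, step=5):
    """Return (window_start, mean per-site diversity) for each full window."""
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned to the same length")
    per_site = [
        site_diversity("".join(s[i] for s in seqs)) for i in range(length)
    ]
    return [
        (start, sum(per_site[start:start + window]) / window)
        for start in range(0, length - window + 1, step)
    ]


def r_squared(site_a, site_b):
    """Pairwise r^2 between two biallelic sites coded 0 (ref) / 1 (alt).

    Returns None when either site is monomorphic, because r^2 is
    undefined in that case.
    """
    n = len(site_a)
    if n == 0 or n != len(site_b):
        raise ValueError("sites must be non-empty and of equal length")
    p_a = sum(site_a) / n
    p_b = sum(site_b) / n
    p_ab = sum(1 for a, b in zip(site_a, site_b) if a and b) / n
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        return None
    d = p_ab - p_a * p_b  # coefficient of linkage disequilibrium, D
    return d * d / denom
```

Calling `sliding_pi(aligned_sequences)` yields one value per window start, and `r_squared` can be applied to any pair of mutation presence/absence vectors, for example those restricted to mutations found in at least 10% of sequences.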

    Results from OddPub: Thank you for sharing your data.


    Results from LimitationRecognizer: We detected the following sentences addressing limitations in the study:
    Our analysis has limitations at this point because we had a higher number of sequences from these two regions than others. The sequences were more diversified in the first phase of the pandemic. However, with the arrival of the Delta and Omicron variants, the divergence reduced drastically, possibly due to viral adaptation following the “survival of the fittest” theory of natural selection, although we have found several sub-lineages of the Delta variant. Additionally, we have seen Dhaka being the viral transmission hub, which is expected since it is the capital city of Bangladesh, but this city is not the only transmission source. From extensive analysis, we have built the SARS-CoV-2 transmission network between different administrative divisions and observed the back-and-forth transmission of the virus inside Bangladesh. This situation arose from a lack of restrictions on mass movement, inadequately limited public gatherings, and other socioeconomic events. From the mutational perspective, we have seen a total of 7659 unique mutations present in 4622 sequences with 37.64 mutations per sample, of which on average 24.61 were coding variants, which happens to be significantly higher than the global average of 7.23, reported in July 2020 (38). This sharp rise of mutations indicates that SARS-CoV-2 might be facing strong challenges from the host’s immunologic response in addition to random regular mutational events of RNA viruses, which is one of the reasons for the emergence of ne...
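
The two summary figures quoted above (unique mutations across all sequences, and mean mutations per sample) can be derived from per-sample variant calls along the following lines. This is only a sketch: the function name `summarize_mutations` and the dictionary layout are hypothetical, and the three-sample input used in the usage note is toy data, not the study's.

```python
def summarize_mutations(calls):
    """Summarize per-sample mutation calls.

    `calls` maps a sample ID to the set of mutations called in that
    sample (layout assumed for this sketch). Returns a tuple of
    (number of unique mutations overall, mean mutations per sample).
    """
    if not calls:
        raise ValueError("at least one sample is required")
    unique = set().union(*calls.values())
    mean_per_sample = sum(len(m) for m in calls.values()) / len(calls)
    return len(unique), mean_per_sample
```

For example, passing `{"s1": {"C241T", "A23403G"}, "s2": {"A23403G"}}` counts `A23403G` once toward the unique total while both samples contribute to the per-sample mean.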

    Results from TrialIdentifier: No clinical trial numbers were referenced.


    Results from Barzooka: We did not find any issues relating to the usage of bar graphs.


    Results from JetFighter: We did not find any issues relating to colormaps.


    Results from rtransparent:
    • No conflict of interest statement was detected. If there are no conflicts, we encourage authors to explicitly state so.
    • No funding statement was detected.
    • No protocol registration statement was detected.

    Results from scite Reference Check: We found no unreliable references.


    About SciScore

    SciScore is an automated tool that is designed to assist expert reviewers by finding and presenting formulaic information scattered throughout a paper in a standard, easy to digest format. SciScore checks for the presence and correctness of RRIDs (research resource identifiers), and for rigor criteria such as sex and investigator blinding. For details on the theoretical underpinning of rigor criteria and the tools shown here, including references cited, please follow this link.