SARS-CoV-2 genome analysis of strains in Pakistan reveals GH, S and L clade strains at the start of the pandemic

Abstract

Objectives

Pakistan has a high infectious disease burden with about 265,000 reported cases of COVID-19. We investigated the genomic diversity of SARS-CoV-2 strains and present the first data on viruses circulating in the country.

Methods

We performed whole-genome sequencing and data analysis of eleven SARS-CoV-2 strains isolated in March and May.

Results

Strains from travelers clustered with those from China, Saudi Arabia, India, USA and Australia. Five of eight SARS-CoV-2 strains were GH clade with Spike glycoprotein D614G, Ns3 gene Q57H, and RNA-dependent RNA polymerase (RdRp) P4715L mutations. Two were S clade (ORF8 L84S and N S202N), three were L clade, and one was an I clade strain. One GH and one L strain each displayed Orf1ab L3606F, indicating further evolutionary transitions.
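
The clade calls above rest on marker mutations. As a rough illustration only, the sketch below assigns a clade from a list of mutations using just the markers named in this abstract (Spike D614G plus Ns3 Q57H for GH, ORF8 L84S for S). The `GENE:CHANGE` string format, the `CLADE_MARKERS` table and the `assign_clade` function are hypothetical names introduced here, not the authors' method. The abstract gives no markers for the L and I clades, so those fall through to "unassigned".

```python
# Marker mutations as named in the abstract; the "GENE:CHANGE" string
# format and this table layout are assumptions made for illustration.
CLADE_MARKERS = {
    "GH": {"S:D614G", "NS3:Q57H"},
    "S": {"ORF8:L84S"},
}


def assign_clade(mutations):
    """Return the first clade whose marker set is fully present.

    Strains without any listed marker set are reported as "unassigned",
    because the abstract does not state markers for the L or I clades.
    """
    found = set(mutations)
    for clade, markers in CLADE_MARKERS.items():
        if markers <= found:  # every marker of this clade was observed
            return clade
    return "unassigned"


if __name__ == "__main__":
    print(assign_clade(["S:D614G", "NS3:Q57H", "ORF1ab:P4715L"]))
    print(assign_clade(["ORF8:L84S", "N:S202N"]))
```

A production workflow would normally rely on a maintained clade definition rather than a hand-written table like this one.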

Conclusions

These data reveal that SARS-CoV-2 strains of the L, G, S and I clades have been circulating in Pakistan since March, at the start of the pandemic. They indicate diversity among the viruses causing infection in this populous region. Continuing molecular genomic surveillance of SARS-CoV-2 in the context of disease severity will be important to understand virus transmission patterns and host-related determinants of COVID-19 in Pakistan.

Article activity feed

  1. SciScore for 10.1101/2020.08.04.234153:

    Please note, not all rigor criteria are appropriate for all manuscripts.

    Table 1: Rigor

    Institutional Review Board Statement | IACUC: Ethical approval: This study was approved by the Ethical Review Committee at the Aga Khan University (AKU), Karachi, Pakistan.
    Randomization | not detected.
    Blinding | not detected.
    Power Analysis | not detected.
    Sex as a biological variable | not detected.

    Table 2: Resources

    Software and Algorithms
    Sentences | Resources
    Diagnostic testing for SARS-CoV-2: Nasopharyngeal swab specimens were tested for SARS-CoV-2 by reverse transcription (RT) polymerase chain reaction (PCR) at the Section of Molecular Pathology, Clinical Laboratory,
    Clinical Laboratory
    suggested: None
    Variant calling and Phylogenetic analysis: FASTQ files were aligned to the SARS-CoV-2 virus reference genome Wuhan-1 (NC_045512.2) by BWA (13).
    BWA
    suggested: (BWA, RRID:SCR_010910)
    PICARD tools (http://broadinstitute.github.io/picard/) were used to remove redundant alignments and calculate alignment statistics.
    PICARD
    suggested: (Picard, RRID:SCR_006525)
    Variants were identified by Genome Analysis Toolkit (GATK) (14).
    Genome Analysis Toolkit
    suggested: None
    GATK
    suggested: (GATK, RRID:SCR_001876)
    The MSA was subsequently used to generate a Maximum Likelihood (ML) phylogenetic tree using PhyML 3.0 (http://www.atgc-montpellier.fr/phyml/) with a GTR-based nucleotide substitution model and aLRT SH-Like branch support.
    PhyML
    suggested: (PhyML, RRID:SCR_014629)
    The tree was visualized and edited in Figtree software (http://tree.bio.ed.ac.uk/software/figtree/).
    Figtree
    suggested: (FigTree, RRID:SCR_008515)
    The mean and individual pairwise distances between 7 SARS-CoV-2 sequences from our study and 3 previously deposited Pakistani SARS-CoV-2 sequences were calculated using MEGA 7 (18).
    MEGA
    suggested: (Mega BLAST, RRID:SCR_011920)
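
The pairwise-distance step quoted above can be illustrated with a small sketch. The source does not say which distance model was used in MEGA 7, so this example assumes the simple p-distance: the proportion of differing sites among positions where both aligned sequences carry an unambiguous base. The function names and the toy sequences are hypothetical.

```python
from itertools import combinations


def p_distance(seq_a, seq_b):
    """Proportion of differing sites between two aligned sequences.

    Sites where either sequence has a gap or an ambiguous base
    (anything other than A, C, G, T) are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differing = 0
    for base_a, base_b in zip(seq_a.upper(), seq_b.upper()):
        if base_a in "ACGT" and base_b in "ACGT":
            compared += 1
            if base_a != base_b:
                differing += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differing / compared


def mean_pairwise_distance(sequences):
    """Average p-distance over all unordered pairs in a name -> sequence dict."""
    pairs = list(combinations(sequences.values(), 2))
    if not pairs:
        raise ValueError("need at least two sequences")
    return sum(p_distance(a, b) for a, b in pairs) / len(pairs)


if __name__ == "__main__":
    # Toy alignment, not real data.
    toy = {"strain_1": "ACGT", "strain_2": "ACGA", "strain_3": "ACGT"}
    print(mean_pairwise_distance(toy))
```

Model-based distances (for example, corrections for multiple substitutions) would give different values from this uncorrected proportion.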

    Results from OddPub: We did not detect open data. We also did not detect open code. Researchers are encouraged to share open data when possible (see Nature blog).


    Results from LimitationRecognizer: An explicit section about the limitations of the techniques employed in this study was not found. We encourage authors to address study limitations.

    Results from TrialIdentifier: No clinical trial numbers were referenced.


    Results from Barzooka: We did not find any issues relating to the usage of bar graphs.


    Results from JetFighter: We did not find any issues relating to colormaps.


    Results from rtransparent:
    • Thank you for including a conflict of interest statement. Authors are encouraged to include this statement when submitting to a journal.
    • Thank you for including a funding statement. Authors are encouraged to include this statement when submitting to a journal.
    • No protocol registration statement was detected.

    About SciScore

    SciScore is an automated tool that is designed to assist expert reviewers by finding and presenting formulaic information scattered throughout a paper in a standard, easy to digest format. SciScore checks for the presence and correctness of RRIDs (research resource identifiers), and for rigor criteria such as sex and investigator blinding. For details on the theoretical underpinning of rigor criteria and the tools shown here, including references cited, please follow this link.
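
To make the RRID check concrete, here is a minimal sketch of pulling RRID-style identifiers out of free text. It is not how SciScore itself works. The regular expression is an assumption based on the identifiers shown in Table 2 (for example, `RRID:SCR_010910`), and the sketch only extracts identifiers without checking them against any registry.

```python
import re

# Assumed shape: "RRID:" followed by a letter prefix, an underscore, and an
# alphanumeric identifier, as in the entries listed in Table 2.
RRID_PATTERN = re.compile(r"RRID:\s*([A-Za-z]+_[A-Za-z0-9]+)")


def extract_rrids(text):
    """Return RRID identifiers in order of first appearance, without duplicates."""
    seen = []
    for match in RRID_PATTERN.finditer(text):
        rrid = match.group(1)
        if rrid not in seen:
            seen.append(rrid)
    return seen


if __name__ == "__main__":
    sample = "suggested: (BWA, RRID:SCR_010910); suggested: (Picard, RRID:SCR_006525)"
    print(extract_rrids(sample))
```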