Assessing the evolution of SARS-CoV-2 lineages and the dynamic associations between nucleotide variations

Abstract

Despite seminal advances towards understanding the infection mechanism of SARS-CoV-2 (severe acute respiratory syndrome coronavirus 2), it continues to cause significant morbidity and mortality worldwide. Though mass immunization programmes have been implemented in several countries, the viral transmission cycle has shown a continuous progression in the form of multiple waves. A constant change in the frequencies of dominant viral lineages, arising from the accumulation of nucleotide variations (NVs) through favourable selection, is understandably expected to be a major determinant of disease severity and possible vaccine escape. Indeed, worldwide efforts have been initiated to identify specific virus lineage(s) and/or NVs that may cause a severe clinical presentation or facilitate vaccination breakthrough. Since host genetics is expected to play a major role in shaping virus evolution, it is imperative to study the role of genome-wide SARS-CoV-2 NVs across various populations. In the current study, we analysed the whole genome sequence of 3543 SARS-CoV-2-infected samples obtained from the state of Telangana, India (including 210 from our previous study), collected over an extended period from April 2020 to October 2021. We present a unique perspective on the evolution of prevalent virus lineages and NVs during this period. We also highlight the presence of specific NVs likely to be associated favourably with samples classified as vaccination breakthroughs. Finally, we report genome-wide intra-host variations at novel genomic positions. The results presented here provide critical insights into virus evolution over an extended period and pave the way to rigorously investigate the role of specific NVs in vaccination breakthroughs.

Article activity feed

  1. SciScore for 10.1101/2022.01.19.22269572:

    Please note, not all rigor criteria are appropriate for all manuscripts.

    Table 1: Rigor

    Ethics
        IRB: The work was initiated following approvals from the Institutional Bioethics committee and Biosafety committee.
    Sex as a biological variable
        Sample collection strategy, dataset structure and features: A total of 3543 samples (1407 females and 2091 males; information unavailable for 45 samples), representing the period April 1, 2020 to October 31, 2021, and belonging to Telangana, India, were analysed in this study (Table S1A).
    Randomization
        Odds ratios for estimating the association likelihoods of genomic alterations with vaccination breakthrough cases were estimated by creating contingency matrices for each NV identified in >5% of vaccinated samples and were compared with multiple random subsamples of non-vaccinated cases starting March 2021 onwards.
    Blinding
        not detected.
    Power Analysis
        not detected.
    Cell Line Authentication
        Authentication: The synthesized cDNA was amplified using a multiplex polymerase chain reaction (PCR) protocol, producing 98 amplicons across the SARS-CoV-2 genome (https://artic.network/).
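
The odds-ratio procedure quoted under Randomization can be illustrated with a minimal sketch. It is not the authors' code: the function names, the boolean-list input format, the subsample size, and the 0.5 zero-cell correction (Haldane-Anscombe) are all assumptions made for illustration.

```python
import random


def odds_ratio(a, b, c, d):
    """Odds ratio for the 2x2 contingency table [[a, b], [c, d]].

    a: vaccinated samples carrying the NV, b: vaccinated samples without it,
    c: non-vaccinated samples carrying the NV, d: non-vaccinated without it.
    Assumption: if any cell is zero, 0.5 is added to every cell
    (Haldane-Anscombe correction) to avoid division by zero.
    """
    if 0 in (a, b, c, d):
        a, b, c, d = (x + 0.5 for x in (a, b, c, d))
    return (a * d) / (b * c)


def subsampled_odds_ratios(vaccinated, non_vaccinated, n_subsamples=100, seed=0):
    """Odds ratios of one NV against repeated random non-vaccinated subsamples.

    vaccinated, non_vaccinated: lists of booleans (True = sample carries the NV).
    Assumption: each subsample has the size of the smaller of the two groups.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    k = min(len(vaccinated), len(non_vaccinated))
    a = sum(vaccinated)
    b = len(vaccinated) - a
    ratios = []
    for _ in range(n_subsamples):
        subsample = rng.sample(non_vaccinated, k)  # sampling without replacement
        c = sum(subsample)
        d = k - c
        ratios.append(odds_ratio(a, b, c, d))
    return ratios
```

An odds ratio above 1 across most subsamples would suggest the NV is over-represented among vaccinated samples; the quoted sentence does not say how the subsample results were summarised, so that step is left out here.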

    Table 2: Resources

    Software and Algorithms

    Sentence: All reads shorter than 30 bases or with a Phred quality score <20 were discarded.
    Resource: Phred
    Suggested: (Phred, RRID:SCR_001017)

    Sentence: Reads were assembled to generate a consensus FASTA file using samtools mpileup and the consensus module of iVar, with a base assigned as consensus if it had a minimum depth of at least 10 reads (setting ivarMinDepth=10).
    Resource: samtools
    Suggested: (SAMTOOLS, RRID:SCR_002105)

    Sentence: All structural representations were generated in PyMOL (The PyMOL Molecular Graphics System, Schrödinger, LLC).
    Resource: PyMOL
    Suggested: (PyMOL, RRID:SCR_000305)
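
The two thresholds quoted in this table (reads of at least 30 bases with Phred quality of at least 20, and a minimum depth of 10 for a consensus base) can be sketched in plain Python. This is a simplified illustration, not the samtools or iVar implementation: the function names are hypothetical, the quality threshold is assumed to apply to the mean score of a read with Phred+33 encoding, low-depth positions are assumed to be masked as N, and the consensus is a simple majority call.

```python
from collections import Counter


def mean_phred(qual, offset=33):
    """Mean Phred score of a FASTQ quality string (Phred+33 encoding assumed)."""
    return sum(ord(ch) - offset for ch in qual) / len(qual)


def keep_read(seq, qual, min_len=30, min_q=20):
    """Keep a read only if it is long enough and its mean quality is high enough.

    Assumption: the quoted threshold of 20 is applied to the mean read quality;
    the source sentence does not say whether it is per base or per read.
    """
    return len(seq) >= min_len and mean_phred(qual) >= min_q


def call_consensus(pileup, min_depth=10):
    """Majority-base consensus over a list of pileup columns.

    pileup: list of strings, each holding the bases observed at one position.
    Positions covered by fewer than min_depth reads are masked as 'N'
    (an assumption made for this sketch).
    """
    consensus = []
    for column in pileup:
        if len(column) < min_depth:
            consensus.append("N")
        else:
            consensus.append(Counter(column).most_common(1)[0][0])
    return "".join(consensus)
```

For example, `call_consensus(["A" * 10, "C" * 9])` masks the second position because only nine reads cover it.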

    Results from OddPub: Thank you for sharing your data.


    Results from LimitationRecognizer: An explicit section about the limitations of the techniques employed in this study was not found. We encourage authors to address study limitations.

    Results from TrialIdentifier: No clinical trial numbers were referenced.


    Results from Barzooka: We did not find any issues relating to the usage of bar graphs.


    Results from JetFighter: We did not find any issues relating to colormaps.


    Results from rtransparent:
    • Thank you for including a conflict of interest statement. Authors are encouraged to include this statement when submitting to a journal.
    • Thank you for including a funding statement. Authors are encouraged to include this statement when submitting to a journal.
    • No protocol registration statement was detected.

    Results from scite Reference Check: We found no unreliable references.


    About SciScore

    SciScore is an automated tool designed to assist expert reviewers by finding and presenting formulaic information scattered throughout a paper in a standard, easy-to-digest format. SciScore checks for the presence and correctness of RRIDs (research resource identifiers) and for rigor criteria such as sex and investigator blinding. For details on the theoretical underpinning of rigor criteria and the tools shown here, including references cited, please follow this link.