Spatio-temporal dynamics of intra-host variability in SARS-CoV-2 genomes

Abstract

During the course of the COVID-19 pandemic, large-scale genome sequencing of SARS-CoV-2 has been useful in tracking its spread and in identifying variants of concern (VOC). Viral and host factors could contribute to variability within a host that can be captured in next-generation sequencing reads as intra-host single nucleotide variations (iSNVs). Analysing 1347 samples collected until June 2020, we recorded 16 410 iSNV sites throughout the SARS-CoV-2 genome. We found ∼42% of the iSNV sites to be reported as SNVs by 30 September 2020 in consensus sequences submitted to GISAID, which increased to ∼80% by 30 June 2021. Following this, analysis of another set of 1774 samples sequenced in India between November 2020 and May 2021 revealed that the majority of the Delta (B.1.617.2) and Kappa (B.1.617.1) lineage-defining variations appeared as iSNVs before getting fixed in the population. In addition, mutations in RdRp as well as RNA editing by APOBEC and ADAR deaminases seem to contribute to the differential prevalence of iSNVs in hosts. We also observe hyper-variability at functionally critical residues in the Spike protein that could alter antigenicity and may contribute to immune escape. Thus, tracking and functional annotation of iSNVs in ongoing genome surveillance programs could be important for early identification of potential variants of concern and actionable interventions.
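The overlap figures quoted in the abstract (the share of iSNV sites later reported as consensus SNVs) amount to a set intersection over genome positions. The following is a minimal sketch of that arithmetic only, not the authors' pipeline. The function name `fraction_later_reported` and all positions are hypothetical placeholders chosen for illustration.

```python
def fraction_later_reported(isnv_sites, consensus_snv_sites):
    """Return the fraction of iSNV sites that also appear as consensus SNV sites."""
    isnv = set(isnv_sites)
    if not isnv:
        return 0.0  # avoid dividing by zero when no iSNV sites were recorded
    return len(isnv & set(consensus_snv_sites)) / len(isnv)


# Hypothetical genome positions, for illustration only (not data from the study).
isnv_sites = [101, 202, 303, 404, 505]
snvs_by_first_cutoff = [101, 404]             # consensus SNVs seen by an earlier date
snvs_by_second_cutoff = [101, 202, 303, 404]  # consensus SNVs seen by a later date

early = fraction_later_reported(isnv_sites, snvs_by_first_cutoff)
late = fraction_later_reported(isnv_sites, snvs_by_second_cutoff)
print(f"{early:.0%} -> {late:.0%}")
```

Using sets makes the comparison independent of how many samples carry each site, which matches a per-site (rather than per-sample) reading of the reported percentages; that reading is itself an assumption.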

Article activity feed

  1. SciScore for 10.1101/2020.12.09.417519:

    Please note, not all rigor criteria are appropriate for all manuscripts.

    Table 1: Rigor

    NIH rigor criteria are not applicable to paper type.

    Table 2: Resources

    Software and Algorithms
    Sentences | Resources
    In order to trim the adaptor sequences and filter low-quality reads from downstream analysis, we used Trimmomatic (58).
    Trimmomatic
    suggested: (Trimmomatic, RRID:SCR_011848)
    All reads which couldn’t be aligned in pairs to the human reference genome were filtered out from the human aligned files using SAMtools (60) and mapped to the SARS-CoV-2 reference genome (NC_045512.2) using Burrows-Wheeler Aligner (BWA-MEM) (61).
    SAMtools
    suggested: (SAMTOOLS, RRID:SCR_002105)
    FastQC (64), QualiMap (65) and SAMtools (60) were used to obtain sample quality scores and alignment statistics at each step.
    FastQC
    suggested: (FastQC, RRID:SCR_014583)
    QualiMap
    suggested: (QualiMap, RRID:SCR_001209)
    MAFFT (66) was used to create multiple sequence alignments of consensus FASTA sequences of the processed samples and the downloaded FASTA sequences from GISAID.
    MAFFT
    suggested: (MAFFT, RRID:SCR_011811)
    In order to categorize the specific amino acid change and the proteins containing the iSNVs, they were annotated using SnpEff version 4.5 (70).
    SnpEff
    suggested: (SnpEff, RRID:SCR_005191)
    We utilized PyMol (73) to determine interface residues from the available crystal structures.
    PyMol
    suggested: (PyMOL, RRID:SCR_000305)
    Data handling and visualizations: Custom python scripts were used to automate the process of downloading and retrieving samples from NCBI SRA as well as to process samples through the pipeline.
    python
    suggested: (IPython, RRID:SCR_001658)
    Python libraries such as NumPy (79), Pandas (80), Matplotlib (81) and Seaborn (82) were used for data handling and visualizations.
    NumPy
    suggested: (NumPy, RRID:SCR_008633)
    Matplotlib
    suggested: (MatPlotLib, RRID:SCR_008624)
    SciPy was used to compute pairwise two-sided t-tests and to calculate distribution statistics and Pearson’s correlation coefficients.
    SciPy
    suggested: (SciPy, RRID:SCR_008058)
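The last entry in the table above mentions two-sided t-tests and Pearson's correlation coefficients computed with SciPy. As a rough illustration of what those calls look like, here is a small sketch using `scipy.stats.ttest_ind` and `scipy.stats.pearsonr`. The choice of an independent-samples test is an assumption (the quoted sentence does not say whether the tests were paired), and every variable name and number below is a made-up placeholder.

```python
from scipy import stats

# Hypothetical per-sample iSNV counts for two groups (placeholder values).
group_a = [12, 15, 11, 18, 14, 16]
group_b = [22, 19, 25, 21, 24, 20]

# Independent-samples t-test; SciPy reports a two-sided p-value by default.
t_stat, t_p = stats.ttest_ind(group_a, group_b)

# Hypothetical sequencing depth paired with the iSNV counts of group_a.
depth = [100, 150, 90, 210, 140, 170]

# Pearson's correlation coefficient and its two-sided p-value.
r, r_p = stats.pearsonr(depth, group_a)

print(f"t = {t_stat:.2f}, p = {t_p:.4f}")
print(f"r = {r:.2f}, p = {r_p:.4f}")
```

If the comparisons were made on matched samples, `scipy.stats.ttest_rel` would be the corresponding paired test.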

    Results from OddPub: Thank you for sharing your code and data.


    Results from LimitationRecognizer: We detected the following sentences addressing limitations in the study:
    However, there are limitations to this analysis as the reports are few, the platforms were different, and these were not follow-up studies at regular intervals. These observations substantiate that editing within hosts may possibly lead to an evolved immune escape ability in some strains which may seem to be a case of reinfection in a host after weeks or months of the first incidence. In conclusion, temporally tracking within-host variability of the virus in individuals and populations might provide important leads to the sites that are favourable or deleterious for virus survival. This information would be of enormous utility for diagnostics, design of vaccines as well as predicting the spread and infectivity of viral strains in the population. Conjoint analysis with the host variability in editing machinery should be the next step.

    Results from TrialIdentifier: No clinical trial numbers were referenced.


    Results from Barzooka: We did not find any issues relating to the usage of bar graphs.


    Results from JetFighter: Please consider improving the rainbow (“jet”) colormap(s) used on page 31. At least one figure is not accessible to readers with colorblindness and/or is not true to the data, i.e. not perceptually uniform.
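One common way to address the JetFighter note is to replace the "jet" colormap with a perceptually uniform one such as Matplotlib's built-in "viridis". The sketch below shows the change on a placeholder heatmap; the data and labels are invented for illustration, and the actual figure on page 31 may need a different plot type.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend so the script runs without a display
import matplotlib.pyplot as plt

# Placeholder 10 x 10 grid of values (not data from the manuscript).
data = [[row * col for col in range(10)] for row in range(10)]

fig, ax = plt.subplots()
# "viridis" is perceptually uniform and readable under common forms of colorblindness.
image = ax.imshow(data, cmap="viridis")
fig.colorbar(image, ax=ax, label="value (placeholder)")
ax.set_title("Heatmap with a perceptually uniform colormap")
```

Other perceptually uniform built-ins such as "cividis", "plasma", "inferno", or "magma" can be passed to `cmap` in the same way.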


    Results from rtransparent:
    • Thank you for including a conflict of interest statement. Authors are encouraged to include this statement when submitting to a journal.
    • Thank you for including a funding statement. Authors are encouraged to include this statement when submitting to a journal.
    • No protocol registration statement was detected.

    About SciScore

    SciScore is an automated tool that is designed to assist expert reviewers by finding and presenting formulaic information scattered throughout a paper in a standard, easy to digest format. SciScore checks for the presence and correctness of RRIDs (research resource identifiers), and for rigor criteria such as sex and investigator blinding. For details on the theoretical underpinning of rigor criteria and the tools shown here, including references cited, please follow this link.