Shared within-host SARS-CoV-2 variation in households

This article has been reviewed by the following groups


Abstract

Background

The limited variation observed among SARS-CoV-2 consensus sequences makes it difficult to reconstruct transmission linkages in outbreak settings. Previous studies have recovered variation within individual SARS-CoV-2 infections but have not yet measured the informativeness of within-host variation for transmission inference.

Methods

We performed tiled amplicon sequencing on 307 SARS-CoV-2 samples from four prospective studies and combined sequence data with household membership data, a proxy for transmission linkage.

Results

Consensus sequences from households had limited diversity (mean pairwise distance, 3.06 SNPs; range, 0-40). Most samples (83.1%, 255/307) harbored at least one intrahost single nucleotide variant (iSNV; median: 117; IQR: 17-208) when applying a liberal minor allele frequency threshold of 0.5%, prior to filtering. A mean of 15.4% of within-host iSNVs were recovered one day later. Pairs in the same household shared significantly more iSNVs (mean: 1.20 iSNVs; 95% CI: 1.02-1.39) than did pairs in different households infected with the same viral clade (mean: 0.31 iSNVs; 95% CI: 0.28-0.34), a signal that grows stronger as thresholds become more liberal.
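The pairwise comparison described above can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the toy samples, the `(position, alt_base)` variant key, and the function names `isnvs_above` and `mean_shared_isnvs` are hypothetical, and the sketch omits the paper's restriction of between-household pairs to the same viral clade. It only shows how shared iSNVs could be counted for each pair of samples at a chosen minor allele frequency threshold and then averaged by household status.

```python
from itertools import combinations
from statistics import mean

# Hypothetical toy data (not from the study):
# sample ID -> (household ID, {(genome position, alt base): minor allele frequency})
samples = {
    "A1": ("hhA", {(241, "T"): 0.04, (3037, "T"): 0.012, (14408, "T"): 0.30}),
    "A2": ("hhA", {(241, "T"): 0.02, (14408, "T"): 0.25, (23403, "G"): 0.006}),
    "B1": ("hhB", {(241, "T"): 0.03, (11083, "T"): 0.10}),
    "B2": ("hhB", {(11083, "T"): 0.08, (23403, "G"): 0.004}),
}


def isnvs_above(freqs, min_maf):
    """Return the set of variants at or above the frequency threshold."""
    return {variant for variant, freq in freqs.items() if freq >= min_maf}


def mean_shared_isnvs(samples, min_maf=0.005):
    """Mean number of shared iSNVs for same-household and different-household pairs."""
    same, diff = [], []
    for (_, (hh_a, freqs_a)), (_, (hh_b, freqs_b)) in combinations(samples.items(), 2):
        # A variant counts as shared if it passes the threshold in both samples.
        shared = len(isnvs_above(freqs_a, min_maf) & isnvs_above(freqs_b, min_maf))
        (same if hh_a == hh_b else diff).append(shared)
    return mean(same), mean(diff)


# Compare a liberal threshold (0.5%) with a stricter one (3%).
for threshold in (0.005, 0.03):
    within, between = mean_shared_isnvs(samples, threshold)
    print(f"MAF >= {threshold}: same household {within}, different households {between}")
```

A real analysis would also need depth filters, handling of repeated samples from one person, and confidence intervals, none of which are sketched here.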

Conclusions

Although only a subset of within-host variation is consistently shared across likely transmission pairs, shared iSNVs may augment the information in consensus sequences for predicting transmission linkages.

Article activity feed

  1. SciScore for 10.1101/2022.05.26.22275279

    Please note, not all rigor criteria are appropriate for all manuscripts.

    Table 1: Rigor

    Ethics
      Consent: All study participants provided written consent and all studies were approved by the Institutional Review Board of Stanford University (Numbers: 55479, 57686, 56032, and 55619).
      IRB: All study participants provided written consent and all studies were approved by the Institutional Review Board of Stanford University (Numbers: 55479, 57686, 56032, and 55619).
    Sex as a biological variable: not detected.
    Randomization: (b) A randomized, single-blind, placebo-controlled trial of Peginterferon Lambda-1a (Lambda) for reducing the duration of viral shedding or symptoms[17] in which oropharyngeal swabs were collected for 28 days following enrollment.
    Blinding: not detected.
    Power Analysis: not detected.

    Table 2: Resources

    Software and Algorithms
    Sentence: Sequence data is available at SRA (BioProject ID: PRJNA842503).
      Resource: BioProject
      Suggested: (NCBI BioProject, RRID:SCR_004801)
    Sentence: We also used this pipeline to remove reads mapping to the host genome with Kraken2[24], map reads with Bowtie 2[22], generate consensus sequences with bcftools[25], and assign Nextclade lineages[26].
      Resource: Kraken2
      Suggested: None

    Results from OddPub: Thank you for sharing your data.


    Results from LimitationRecognizer: We detected the following sentences addressing limitations in the study:
    Our study has several limitations. First, we focused on a convenience sample of residual samples with accompanying household information collected in California from March 2020 through May 2021. Replicating these findings in other settings and with more recently emerged SARS-CoV-2 lineages is critical to understand the generalizability of our findings. Second, our study focused on the potential epidemiological value of within-host viral variation. Our focus was on transmission linkage rather than on viral evolutionary dynamics or transmission bottlenecks, which might have different optimal variant identification approaches. Third, many groups have hypothesized that evolution within immune-compromised or immune-suppressed populations may be an important driver of the emergence of new variants of concern or interest[37–41]. Our sample collection did not enable us to test these hypotheses. Fourth, the epidemiological utility of within-host variation depends on SARS-CoV-2 sampling and sequencing. Routine sequencing may not always generate sufficient depth to accurately recover within-host variation. In conclusion, we find that SARS-CoV-2 variation within individual hosts may be shared across transmission pairs and may contribute information on transmission linkage on a backdrop of limited diversity among consensus sequences. More broadly, pathogen diversity within individual infections holds largely untapped information that may enhance the resolution of transmission inferences.

    Results from TrialIdentifier: No clinical trial numbers were referenced.


    Results from Barzooka: We did not find any issues relating to the usage of bar graphs.


    Results from JetFighter: We did not find any issues relating to colormaps.


    Results from rtransparent:
    • Thank you for including a conflict of interest statement. Authors are encouraged to include this statement when submitting to a journal.
    • Thank you for including a funding statement. Authors are encouraged to include this statement when submitting to a journal.
    • Thank you for including a protocol registration statement.

    Results from scite Reference Check: We found no unreliable references.


    About SciScore

    SciScore is an automated tool designed to assist expert reviewers by finding and presenting formulaic information scattered throughout a paper in a standard, easy-to-digest format. SciScore checks for the presence and correctness of RRIDs (research resource identifiers), and for rigor criteria such as sex and investigator blinding.