One viral sequence for each host? – The neglected within-host diversity as the main stage of SARS-CoV-2 evolution

Abstract

The standard practice of presenting one viral sequence for each infected individual implicitly assumes low within-host genetic diversity. It places the emphasis on viral evolution between, rather than within, hosts. To determine this diversity, we collect SARS-CoV-2 samples from the same patient multiple times. Our own data, in conjunction with previous reports, show that two viral samples collected from the same individual are often very different because of substantial within-host diversity. Each sample captures only a small part of the total diversity that is transiently and locally released from infected cells. Hence, the global SARS-CoV-2 population is a meta-population consisting of the viruses in all infected hosts, each of which harbors a genetically diverse sub-population. Advantageous mutations must first be present as within-host diversity before they are revealed as between-host polymorphism. The early detection of such diversity in multiple hosts could serve as an alarm for potentially dangerous mutations. In conclusion, the main forces of viral evolution, i.e., mutation, drift, recombination and selection, all operate within hosts and should be studied accordingly. Several significant implications are discussed.

Article activity feed

  1. SciScore for 10.1101/2021.06.21.449205:

    Please note, not all rigor criteria are appropriate for all manuscripts.

    Table 1: Rigor

    NIH rigor criteria are not applicable to paper type.

    Table 2: Resources

    Software and Algorithms
    Sentences | Resources
    The final viral-enriched libraries were sequenced on an Illumina NextSeq500 in 2×75 bp paired-end mode. iSNVs calling: (1) After quality control, sequencing reads were aligned in paired-end mode to the reference genome sequence (GenBank accession no. NC_045512.2 (Wu et al. 2020)) using Bowtie2 v2.1.0 (Langmead and Salzberg 2012) with default parameters, and the alignments were reformatted using SAMtools v1.3.1 (Li et al. 2009); (2) for each site of the SARS-CoV-2 genome, the aligned low-quality bases and indels were excluded to reduce possible false positives, and the site depth and strand bias were recalculated; (3) samples with more than 3,000 sites with a sequencing depth ≥100× were selected as candidate samples for iSNVs calling.
    Bowtie2
    suggested: (Bowtie 2, RRID:SCR_016368)
    SAMtools
    suggested: (SAMTOOLS, RRID:SCR_002105)
    After calling variants, we used ANNOVAR software (Yang and Wang 2015) to annotate the variants and obtained the alternative-allele count and total depth for each variant using SAMtools (see Data file S1).
    ANNOVAR
    suggested: (ANNOVAR, RRID:SCR_012821)
    We used Python scripts to merge the frequencies of iSNVs of these 138 samples (see Data file S2).
    python
    suggested: (IPython, RRID:SCR_001658)
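
The quoted sentences describe three steps that are easy to misread in prose: selecting candidate samples by depth, turning an alternative-allele count and total depth into a frequency, and merging per-sample iSNV frequencies. The authors' scripts are not available here, so the following is only a minimal sketch under stated assumptions, not their implementation. The function names, the input layout (a list of per-site depths, and tuples of position, reference base, alternative base, alternative count, depth) and the sample identifiers are all hypothetical; only the two thresholds come from the quoted text.

```python
from collections import defaultdict

# Thresholds taken from the quoted methods sentence.
MIN_DEPTH = 100            # a site counts as covered at depth >= 100x
MIN_COVERED_SITES = 3000   # a sample needs MORE than this many covered sites


def is_candidate_sample(site_depths, min_depth=MIN_DEPTH,
                        min_sites=MIN_COVERED_SITES):
    """Return True if more than `min_sites` sites reach `min_depth`.

    `site_depths` is an assumed input: one integer depth per genome site.
    """
    covered = sum(1 for depth in site_depths if depth >= min_depth)
    return covered > min_sites


def alt_allele_frequency(alt_count, total_depth):
    """Alternative-allele frequency, or None when the site has no coverage."""
    if total_depth <= 0:
        return None
    return alt_count / total_depth


def merge_isnv_frequencies(per_sample_variants):
    """Merge per-sample variant calls into one table keyed by variant.

    `per_sample_variants` (assumed layout):
        {sample_id: [(pos, ref, alt, alt_count, total_depth), ...]}
    Returns:
        {(pos, ref, alt): {sample_id: frequency}}
    """
    merged = defaultdict(dict)
    for sample_id, variants in per_sample_variants.items():
        for pos, ref, alt, alt_count, total_depth in variants:
            freq = alt_allele_frequency(alt_count, total_depth)
            if freq is not None:  # skip sites without usable depth
                merged[(pos, ref, alt)][sample_id] = freq
    return dict(merged)


if __name__ == "__main__":
    # Made-up illustrative calls, not data from the manuscript.
    calls = {
        "sample_A": [(241, "C", "T", 30, 300)],
        "sample_B": [(241, "C", "T", 5, 100)],
    }
    for variant, freqs in merge_isnv_frequencies(calls).items():
        print(variant, freqs)
```

Keying the merged table by (position, reference, alternative) keeps the same variant from different samples on one row, which is the shape a between-sample comparison of within-host diversity would need.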

    Results from OddPub: We did not detect open data. We also did not detect open code. Researchers are encouraged to share open data when possible (see Nature blog).


    Results from LimitationRecognizer: An explicit section about the limitations of the techniques employed in this study was not found. We encourage authors to address study limitations.

    Results from TrialIdentifier: No clinical trial numbers were referenced.


    Results from Barzooka: We did not find any issues relating to the usage of bar graphs.


    Results from JetFighter: We did not find any issues relating to colormaps.


    Results from rtransparent:
    • Thank you for including a conflict of interest statement. Authors are encouraged to include this statement when submitting to a journal.
    • No funding statement was detected.
    • No protocol registration statement was detected.

    Results from scite Reference Check: We found no unreliable references.


    About SciScore

    SciScore is an automated tool designed to assist expert reviewers by finding and presenting formulaic information scattered throughout a paper in a standard, easy-to-digest format. SciScore checks for the presence and correctness of RRIDs (research resource identifiers), and for rigor criteria such as sex and investigator blinding. For details on the theoretical underpinning of rigor criteria and the tools shown here, including references cited, please follow this link.