Temporal dynamics of SARS-CoV-2 mutation accumulation within and across infected hosts
This article has been reviewed by the following groups
Listed in
- Evaluated articles (ScreenIT)
Abstract
Analysis of SARS-CoV-2 genetic diversity within infected hosts can provide insight into the generation and spread of new viral variants and may enable high resolution inference of transmission chains. However, little is known about temporal aspects of SARS-CoV-2 intrahost diversity and the extent to which shared diversity reflects convergent evolution as opposed to transmission linkage. Here we use high depth of coverage sequencing to identify within-host genetic variants in 325 specimens from hospitalized COVID-19 patients and infected employees at a single medical center. We validated our variant calling by sequencing defined RNA mixtures and identified viral load as a critical factor in variant identification. By leveraging clinical metadata, we found that intrahost diversity is low and does not vary by time from symptom onset. This suggests that variants will only rarely rise to appreciable frequency prior to transmission. Although there was generally little shared variation across the sequenced cohort, we identified intrahost variants shared across individuals who were unlikely to be related by transmission. These variants did not precede a rise in frequency in global consensus genomes, suggesting that intrahost variants may have limited utility for predicting future lineages. These results provide important context for sequence-based inference in SARS-CoV-2 evolution and epidemiology.
Article activity feed
SciScore for 10.1101/2021.01.19.427330:
Please note, not all rigor criteria are appropriate for all manuscripts.
Table 1: Rigor
- Institutional Review Board Statement: IRB: These studies and the use of residual specimens were approved by the University of Michigan Institutional Review Board.
- Randomization: not detected.
- Blinding: not detected.
- Power Analysis: not detected.
- Sex as a biological variable: not detected.
Table 2: Resources
Software and Algorithms
- Sentence: Analysis of sequence reads: We aligned reads to the MN908947.3 reference genome with BWA-MEM version 0.7.15 [39].
  Resource: BWA-MEM, suggested: (Sniffles, RRID:SCR_017619)
- Sentence: We identified single nucleotide variants with iVar 1.2.1 using the following parameters: sample with viral load ≥ 10^3 copies/μL; sample with consensus genome length of ≥ 29000; sample with ≥ 80% of genome sites above 200x coverage; iSNV frequency threshold of 2%; read depth of ≥ 100 at iSNV sites; ≥ 10 reads with average Phred score of > 35 supporting a given iSNV; iVar p-value of < 0.0001.
  Resource: Phred, suggested: (Phred, RRID:SCR_001017)
- Sentence: To generate a phylogenetic tree, we aligned consensus genomes with MUSCLE 3.8.31 and masked positions that are known to commonly exhibit homoplasies or sequencing errors [41].
  Resource: MUSCLE, suggested: (MUSCLE, RRID:SCR_011812)
- Sentence: We generated a maximum likelihood phylogeny with IQ-TREE, using a GTR model and 1000 ultrafast bootstrap replicates [42,43].
  Resource: IQ-TREE, suggested: (IQ-TREE, RRID:SCR_017254)
Results from OddPub: Thank you for sharing your code and data.
Results from LimitationRecognizer: An explicit section about the limitations of the techniques employed in this study was not found. We encourage authors to address study limitations.
Results from TrialIdentifier: No clinical trial numbers were referenced.
Results from Barzooka: We did not find any issues relating to the usage of bar graphs.
Results from JetFighter: We did not find any issues relating to colormaps.
Results from rtransparent:
- Thank you for including a conflict of interest statement. Authors are encouraged to include this statement when submitting to a journal.
- Thank you for including a funding statement. Authors are encouraged to include this statement when submitting to a journal.
- No protocol registration statement was detected.
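The sample-level and variant-level criteria quoted in the Resources table above can be read as two successive filters. The following is a minimal Python sketch of that reading, not the authors' pipeline: the `Sample` and `Variant` classes, their field names, and the function names are hypothetical, the viral load threshold is read as 10^3 copies/μL, and the 2% frequency threshold is assumed to be inclusive.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Sample:
    # Hypothetical container for per-sample metadata.
    viral_load: float            # copies/μL
    consensus_length: int        # bases in the consensus genome
    fraction_sites_200x: float   # fraction of genome sites above 200x coverage


@dataclass
class Variant:
    # Hypothetical container for one candidate iSNV.
    frequency: float        # variant allele frequency, 0.0 to 1.0
    depth: int              # total read depth at the site
    supporting_reads: int   # reads carrying the variant allele
    mean_phred: float       # average Phred score of the supporting reads
    p_value: float          # p-value reported by the variant caller


def sample_passes(s: Sample) -> bool:
    """Sample-level criteria: viral load, consensus length, coverage."""
    return (
        s.viral_load >= 1e3
        and s.consensus_length >= 29000
        and s.fraction_sites_200x >= 0.80
    )


def variant_passes(v: Variant) -> bool:
    """Variant-level criteria: frequency, depth, read support, p-value."""
    return (
        v.frequency >= 0.02       # assumed inclusive threshold
        and v.depth >= 100
        and v.supporting_reads >= 10
        and v.mean_phred > 35     # strict, as quoted
        and v.p_value < 1e-4      # strict, as quoted
    )


def filter_isnvs(sample: Sample, variants: List[Variant]) -> List[Variant]:
    """Return the variants that pass, or nothing if the sample itself fails."""
    if not sample_passes(sample):
        return []
    return [v for v in variants if variant_passes(v)]
```

Applying the sample-level check first means that a low viral load specimen contributes no variants at all, which matches the abstract's point that viral load is a critical factor in variant identification.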
SciScore for 10.1101/2021.01.19.427330:
Please note, not all rigor criteria are appropriate for all manuscripts.
Table 1: Rigor
- Institutional Review Board Statement: not detected.
- Randomization: not detected.
- Blinding: not detected.
- Power Analysis: not detected.
- Sex as a biological variable: not detected.
Table 2: Resources
Software and Algorithms
- Sentence: Analysis of sequence reads: We aligned reads to the MN908947.3 reference genome with BWA-MEM version 0.7.15 [39].
  Resource: BWA-MEM, suggested: (Sniffles, RRID:SCR_017619)
- Sentence: We identified single nucleotide variants with iVar 1.2.1 using the following parameters: sample with viral load ≥ 10^3 copies/μL; sample with consensus genome length of ≥ 29000; sample with ≥ 80% of genome sites above 200x coverage; iSNV frequency threshold of 2%; read depth of ≥ 100 at iSNV sites; ≥ 10 reads with average Phred score of > 35 supporting a given iSNV; iVar p-value of < 0.0001.
  Resource: Phred, suggested: (Phred, RRID:SCR_001017)
- Sentence: To generate a phylogenetic tree, we aligned consensus genomes with MUSCLE 3.8.31 and masked positions that are known to commonly exhibit homoplasies or sequencing errors [41].
  Resource: MUSCLE, suggested: (MUSCLE, RRID:SCR_011812)
- Sentence: We generated a maximum likelihood phylogeny with IQ-TREE, using a GTR model and ultrafast bootstrap replicates [42,43].
  Resource: IQ-TREE, suggested: (IQ-TREE, RRID:SCR_017254)
Results from OddPub: Thank you for sharing your data.
Results from LimitationRecognizer: An explicit section about the limitations of the techniques employed in this study was not found. We encourage authors to address study limitations.
Results from TrialIdentifier: No clinical trial numbers were referenced.
Results from Barzooka: We did not find any issues relating to the usage of bar graphs.
Results from JetFighter: We did not find any issues relating to colormaps.
About SciScore
SciScore is an automated tool that is designed to assist expert reviewers by finding and presenting formulaic information scattered throughout a paper in a standard, easy to digest format. SciScore checks for the presence and correctness of RRIDs (research resource identifiers), and for rigor criteria such as sex and investigator blinding. For details on the theoretical underpinning of rigor criteria and the tools shown here, including references cited, please follow this link.