Host–pathogen dynamics in longitudinal clinical specimens from patients with COVID-19
Abstract
Rapid dissemination of SARS-CoV-2 sequencing data to public repositories has enabled widespread study of viral genomes, but studies of longitudinal specimens from infected persons are relatively limited. Analysis of longitudinal specimens enables understanding of how host immune pressures drive viral evolution in vivo. Here we performed sequencing of 49 longitudinal SARS-CoV-2-positive samples from 20 patients in Washington State collected between March and September of 2020. Viral loads declined over time with an average increase in RT-QPCR cycle threshold of 0.87 per day. We found that there was negligible change in SARS-CoV-2 consensus sequences over time, but identified a number of nonsynonymous variants at low frequencies across the genome. We observed enrichment for a relatively small number of these variants, all of which are now seen in consensus genomes across the globe at low prevalence. In one patient, we saw rapid emergence of various low-level deletion variants at the N-terminal domain of the spike glycoprotein, some of which have previously been shown to be associated with reduced neutralization potency from sera. In a subset of samples that were sequenced using metagenomic methods, differential gene expression analysis showed a downregulation of cytoskeletal genes that was consistent with a loss of ciliated epithelium during infection and recovery. We also identified co-occurrence of bacterial species in samples from multiple hospitalized individuals. These results demonstrate that the intrahost genetic composition of SARS-CoV-2 is dynamic during the course of COVID-19, and highlight the need for continued surveillance and deep sequencing of minor variants.
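As a rough illustration of the reported decline in viral load, the sketch below converts the average Ct increase of 0.87 per day into an approximate fold change in template. It assumes ideal amplification efficiency (template doubles every cycle, so one Ct corresponds to roughly a 2-fold difference). That assumption is not stated in the abstract, and the function names are hypothetical.

```python
import math

# Average daily increase in cycle threshold, as reported in the abstract.
CT_INCREASE_PER_DAY = 0.87


def fold_decrease(days, ct_per_day=CT_INCREASE_PER_DAY, efficiency=2.0):
    """Approximate fold decrease in template after `days`.

    Assumption: each PCR cycle multiplies template by `efficiency`
    (2.0 means ideal doubling); real assays are usually somewhat lower.
    """
    return efficiency ** (ct_per_day * days)


def halving_time_days(ct_per_day=CT_INCREASE_PER_DAY, efficiency=2.0):
    """Days needed for template to halve under the same assumption."""
    return math.log(2) / (ct_per_day * math.log(efficiency))


if __name__ == "__main__":
    print(f"halving time: {halving_time_days():.2f} days")
    print(f"fold decrease after 7 days: {fold_decrease(7):.1f}")
```

Under these assumptions, the halving time is simply the reciprocal of the daily Ct increase; a lower assumed efficiency would lengthen it.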
Article activity feed
SciScore for 10.1101/2021.04.27.21256149:
Please note, not all rigor criteria are appropriate for all manuscripts.
Table 1: Rigor
Ethics: Field Sample Permit: Sample collection and clinical testing for SARS-CoV-2: Specimens were obtained as part of clinical testing for SARS-CoV-2 ordered by local healthcare providers or collected at drive-through testing sites.
Sex as a biological variable: not detected.
Randomization: Samples with more than 10 million reads were randomly down-sampled to 10 million reads before analysis using the “sample” command in seqtk (https://github.com/lh3/seqtk).
Blinding: not detected.
Power Analysis: not detected.
Table 2: Resources
Software and Algorithms (Sentences; Resources)
- Sequencing and bioinformatic analysis: Sequencing was attempted on all samples with a positive RT-PCR assay result that had a Ct ≤36 using either a metagenomic approach described previously [13] via IDT probe-capture [14], or using Swift Biosciences’
  Swift, suggested: (Swift, RRID:SCR_013018)
- Consensus sequences from each individual were aligned with the reference sequence NC_045512 using MAFFT v7 [17]
  MAFFT, suggested: (MAFFT, RRID:SCR_011811)
- RNAseq analysis: Reads were adapter and quality trimmed with Trimmomatic v0.39 [19] using the call “leading 3 trailing 3 slidingwindow:4:15 minlen 20”, then pseudoaligned to the hg38-derived human transcriptome using Kallisto v0.46 [20].
  Trimmomatic, suggested: (Trimmomatic, RRID:SCR_011848)
  Kallisto, suggested: (kallisto, RRID:SCR_016582)
- Differential expression analysis using the Wald test was performed using DEseq2 [21] and deemed significant at a Benjamini-Hochberg adjusted p value < 0.1
  DEseq2, suggested: (DESeq2, RRID:SCR_015687)
- Statistical enrichment of Gene Ontology Biological Processes was performed on all significant genes using the R package clusterProfiler [22].
  Ontology Biological, suggested: None
  clusterProfiler, suggested: (clusterProfiler, RRID:SCR_016884)
- Raw counts have been submitted to the Gene Expression Omnibus, accession GSE173310.
  Gene Expression Omnibus, suggested: (Gene Expression Omnibus (GEO), RRID:SCR_005012)
- Metagenomic analysis: Raw FASTQ files were analyzed using CLOMP v0.1.4 (https://github.com/FredHutch/CLOMP) as previously described [23].
  CLOMP, suggested: None
Results from OddPub: Thank you for sharing your data.
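The Benjamini-Hochberg adjustment named in the differential expression entry of Table 2 can be sketched in a few lines. This is a minimal, generic illustration of the procedure with the 0.1 cutoff quoted above. It is not DESeq2's implementation (which also applies steps such as independent filtering), and the p-values are made up for demonstration.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, scaling each by m / rank and
    # taking a running minimum so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


ALPHA = 0.1  # significance cutoff quoted in the methods sentence

# Hypothetical raw p-values, one per gene (illustrative only).
raw_pvalues = [0.001, 0.04, 0.03, 0.2, 0.5]
adjusted_pvalues = benjamini_hochberg(raw_pvalues)
significant = [p < ALPHA for p in adjusted_pvalues]

if __name__ == "__main__":
    for raw, adj, sig in zip(raw_pvalues, adjusted_pvalues, significant):
        print(f"raw={raw:.3f} adjusted={adj:.4f} significant={sig}")
```

With a cutoff of 0.1, about 10% of the genes called significant are expected to be false discoveries, which is a looser threshold than the more common 0.05.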
Results from LimitationRecognizer: We detected the following sentences addressing limitations in the study:
As a result, one of the limitations of our study is the variability in sequencing depth across samples and the difficulty in ensuring similar sequencing depth for samples from different time points. We used an amplicon sequencing-based approach described previously [27] to obtain near full-length genomes from low viral load samples (up to Ct values of 36). We also used multiple library preparations and performed re-sequencing to ensure the accuracy of variant calls. Taken together, our results suggest that low frequency genomic variants emerge in immunocompetent individuals, but that these variants are unlikely to reach fixation. Given the emergence of rapidly spreading variants of concern over the past several months, the limited intra-host evolution observed in our dataset highlights the critical impact that a select few individual intra-host evolutionary events may have on the course of the global pandemic and the need for continual genomic surveillance.
Results from TrialIdentifier: No clinical trial numbers were referenced.
Results from Barzooka: We did not find any issues relating to the usage of bar graphs.
Results from JetFighter: We did not find any issues relating to colormaps.
Results from rtransparent:
- Thank you for including a conflict of interest statement. Authors are encouraged to include this statement when submitting to a journal.
- Thank you for including a funding statement. Authors are encouraged to include this statement when submitting to a journal.
- No protocol registration statement was detected.
Results from scite Reference Check: We found no unreliable references.
