Intra-host variation and evolutionary dynamics of SARS-CoV-2 populations in COVID-19 patients
This article has been reviewed by the following groups
Listed in
- Evaluated articles (ScreenIT)
Abstract
Background
As of early February 2021, SARS-CoV-2, the causative agent of COVID-19, had infected over 104 million people and caused more than 2 million deaths, according to official reports. Understanding the biology and virus-host interactions of SARS-CoV-2 requires knowledge of the mutation and evolution of this virus at both inter- and intra-host levels. However, although a considerable number of polymorphic sites have been identified among SARS-CoV-2 populations, intra-host variant spectra and their evolutionary dynamics remain mostly unknown.
Methods
Using high-throughput sequencing of metatranscriptomic and hybrid-captured libraries, we characterized consensus genomes and intra-host single nucleotide variants (iSNVs) in serial samples collected from eight patients with COVID-19. We analyzed the distribution of iSNVs along the SARS-CoV-2 genome and identified iSNVs that co-occurred among COVID-19 patients. We also compared the evolutionary dynamics of SARS-CoV-2 populations in the respiratory tract (RT) and gastrointestinal tract (GIT).
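As a rough illustration of what characterizing iSNVs involves, the sketch below computes an alternative allele frequency (AAF) from per-site read counts and keeps sites where a minor allele is present at an intermediate frequency. The `SiteCounts` type, the function names, and the thresholds (`min_depth`, `min_aaf`, `max_aaf`) are hypothetical values chosen for illustration; the abstract does not state the authors' calling criteria, so this should not be read as their method.

```python
from dataclasses import dataclass


@dataclass
class SiteCounts:
    """Read counts per base at one genome position (hypothetical input type)."""
    position: int
    ref_base: str
    counts: dict  # base -> number of reads supporting that base


def alt_allele_frequency(site):
    """Return (alt_base, AAF) for the most common non-reference base.

    Returns (None, 0.0) when the site has no coverage or no non-reference reads.
    """
    depth = sum(site.counts.values())
    if depth == 0:
        return None, 0.0
    alt = {b: n for b, n in site.counts.items() if b != site.ref_base and n > 0}
    if not alt:
        return None, 0.0
    base = max(alt, key=alt.get)
    return base, alt[base] / depth


def call_isnvs(sites, min_depth=100, min_aaf=0.05, max_aaf=0.95):
    """Keep sites with enough depth and an intermediate AAF.

    All three thresholds are illustrative assumptions, not values from the study.
    """
    calls = []
    for site in sites:
        depth = sum(site.counts.values())
        if depth < min_depth:
            continue  # too few reads to estimate a frequency reliably
        base, aaf = alt_allele_frequency(site)
        if base is not None and min_aaf <= aaf <= max_aaf:
            calls.append((site.position, site.ref_base, base, aaf))
    return calls


example_sites = [
    SiteCounts(100, "A", {"A": 900, "G": 100}),  # minor allele at 10%
    SiteCounts(200, "C", {"C": 1000}),           # no variation
    SiteCounts(300, "T", {"T": 10, "C": 10}),    # depth too low
]
example_calls = call_isnvs(example_sites)
```

With the toy input above, only position 100 passes both the depth and frequency filters. In practice the counts would come from a pileup of the mapped reads, and strand bias and base quality filters would normally be applied as well.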
Results
The 32 consensus genomes revealed the co-existence of different genotypes within the same patient. We further identified 40 iSNVs. Most (30/40) iSNVs were present in a single patient, while ten iSNVs were found in at least two patients or were identical to consensus variants. Comparing allele frequencies of the iSNVs revealed a clear genetic differentiation between intra-host populations from the RT and the GIT, mostly driven by bottleneck events during intra-host migrations. Compared to RT populations, the GIT populations showed better maintenance and rapid development of viral genetic diversity following the suspected intra-host bottlenecks.
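The two comparisons described here can be sketched in a few lines: tallying how many patients carry each iSNV, and measuring how far apart two samples are in allele frequency. The function names and the input layout are hypothetical, and the mean absolute AAF difference is a simple stand-in distance chosen for illustration, not the differentiation statistic used in the study. The sketch also covers only sharing among patients, not the comparison against consensus variants.

```python
from collections import defaultdict


def patients_per_isnv(calls):
    """Map each iSNV, keyed by (position, alt_base), to the set of patients carrying it.

    `calls` is an iterable of (patient_id, position, alt_base) tuples.
    """
    seen = defaultdict(set)
    for patient, position, alt_base in calls:
        seen[(position, alt_base)].add(patient)
    return seen


def split_private_shared(calls):
    """Separate iSNVs seen in exactly one patient from those seen in two or more."""
    seen = patients_per_isnv(calls)
    private = sorted(k for k, patients in seen.items() if len(patients) == 1)
    shared = sorted(k for k, patients in seen.items() if len(patients) >= 2)
    return private, shared


def mean_abs_aaf_difference(aaf_a, aaf_b):
    """Mean absolute AAF difference between two samples over the union of their sites.

    Each argument maps a site key to its AAF; a site missing from a sample counts as 0.0.
    """
    sites = set(aaf_a) | set(aaf_b)
    if not sites:
        return 0.0
    total = sum(abs(aaf_a.get(s, 0.0) - aaf_b.get(s, 0.0)) for s in sites)
    return total / len(sites)


toy_calls = [
    ("P1", 100, "G"),
    ("P2", 100, "G"),  # same variant in a second patient -> shared
    ("P1", 250, "T"),  # seen in one patient only -> private
    ("P1", 250, "T"),  # repeat sample from the same patient is counted once
]
private, shared = split_private_shared(toy_calls)
```

Applied to a respiratory and a gastrointestinal sample from the same patient, a large `mean_abs_aaf_difference` would be consistent with the kind of differentiation reported above, although on its own it cannot distinguish a bottleneck from other causes.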
Conclusions
Our findings illustrate the intra-host bottlenecks and evolutionary dynamics of SARS-CoV-2 in different anatomic sites and may provide new insights into the virus-host interactions of coronaviruses and other RNA viruses.
Article activity feed
SciScore for 10.1101/2020.05.20.103549:
Please note, not all rigor criteria are appropriate for all manuscripts.
Table 1: Rigor
NIH rigor criteria are not applicable to paper type.
Table 2: Resources
Software and Algorithms
- Sentence: Full-length consensus genomes were generated from reads mapped to the reference genome (GISAID accession: EPI_ISL_402119) using Pilon (v. 1.23) [16].
  Resource: Pilon, suggested: (Pilon, RRID:SCR_014731)
- Sentence: The collected coronaviridae-like reads were also de novo assembled using SPAdes (v. 3.14.0) with default settings [17] with a maximum of 100-fold coverage of read data.
  Resource: SPAdes, suggested: (SPAdes, RRID:SCR_000131)
- Sentence: Nucleotide differences between the consensus sequences and the reference genome were summarized into artificial Variant Call Format (VCF) files, which were annotated using SnpEff (v.2.0.5) [18] with default settings.
  Resource: SnpEff, suggested: (SnpEff, RRID:SCR_005191)
- Sentence: The assembled SARS-CoV-2 and selected representative genomes were aligned using MAFFT with default settings.
  Resource: MAFFT, suggested: (MAFFT, RRID:SCR_011811)
- Sentence: A maximum likelihood (ML) tree was inferred using the software IQ-TREE (v.1.6.12) [19], with the best fit nucleotide substitution model selected by ModelFinder from the same software.
  Resource: IQ-TREE, suggested: (IQ-TREE, RRID:SCR_017254)
- Sentence: The linkage disequilibrium among the identified consensus variants were estimated using VCFtools (v.0.1.16).
  Resource: VCFtools, suggested: (VCFtools, RRID:SCR_001235)
- Sentence: First, paired-end metatranscriptomic reads were mapped to the reference genome (GISAID accession: EPI_ISL_402119) using BWA aln (v.0.7.16) with default parameters [22].
  Resource: BWA, suggested: (BWA, RRID:SCR_010910)
- Sentence: Duplicated reads were marked using Picard MarkDuplicates (v. 2.10.10) (http://broadinstitute.github.io/picard) with default settings.
  Resource: Picard, suggested: (Picard, RRID:SCR_006525)
- Sentence: A heatmap was generated to visualize the AAFs for all samples using the pheatmap package in R (v.3.6.1).
  Resource: pheatmap, suggested: (pheatmap, RRID:SCR_016418)
- Sentence: Statistics of iSNVs: The distribution of iSNVs among genetic components and patients were summarized and visualized using the Python package matplotlib (v.3.2.1)
  Resource: Python, suggested: (IPython, RRID:SCR_001658); matplotlib, suggested: (MatPlotLib, RRID:SCR_008624)
Results from OddPub: We did not detect open data. We also did not detect open code. Researchers are encouraged to share open data when possible (see Nature blog).
Results from LimitationRecognizer: An explicit section about the limitations of the techniques employed in this study was not found. We encourage authors to address study limitations.
Results from TrialIdentifier: No clinical trial numbers were referenced.
Results from Barzooka: We did not find any issues relating to the usage of bar graphs.
Results from JetFighter: We did not find any issues relating to colormaps.
Results from rtransparent:
- Thank you for including a conflict of interest statement. Authors are encouraged to include this statement when submitting to a journal.
- Thank you for including a funding statement. Authors are encouraged to include this statement when submitting to a journal.
- No protocol registration statement was detected.