Molecular Architecture of Early Dissemination and Evolution of the SARS-CoV-2 Virus in Metropolitan Houston, Texas
Listed in: Evaluated articles (ScreenIT)
Abstract
We sequenced the genomes of 320 SARS-CoV-2 strains from COVID-19 patients in metropolitan Houston, Texas, an ethnically diverse region with seven million residents. These genomes were from the viruses causing infections in the earliest recognized phase of the pandemic affecting Houston. Substantial viral genomic diversity was identified, which we interpret to mean that the virus was introduced into Houston many times independently by individuals who had traveled from different parts of the country and the world. The majority of viruses are apparent progeny of strains derived from Europe and Asia. We found no significant evidence of more virulent viral types, stressing the linkage between severe disease, underlying medical conditions, and perhaps host genetics. We discovered a signal of selection acting on the spike protein, the primary target of massive vaccine efforts worldwide. The data provide a critical resource for assessing virus evolution, the origin of new outbreaks, and the effect of host immune response.
Significance
COVID-19, the disease caused by the SARS-CoV-2 virus, is a global pandemic. To better understand the first phase of virus spread in metropolitan Houston, Texas, we sequenced the genomes of 320 SARS-CoV-2 strains recovered from COVID-19 patients early in the Houston viral arc. We identified no evidence that a particular strain or its progeny causes more severe disease, underscoring the connection between severe disease, underlying health conditions, and host genetics. Some amino acid replacements in the spike protein suggest positive immune selection is at work in shaping variation in this protein. Our analysis traces the early molecular architecture of SARS-CoV-2 in Houston, and will help us to understand the origin and trajectory of future infection spikes.
Article activity feed
SciScore for 10.1101/2020.05.01.072652:
Please note, not all rigor criteria are appropriate for all manuscripts.
Table 1: Rigor
- Institutional Review Board Statement: not detected.
- Randomization: not detected.
- Blinding: not detected.
- Power Analysis: not detected.
- Sex as a biological variable: not detected.
Table 2: Resources
Software and Algorithms
- Sentence: Nucleotide sequence alignments for the combined Houston and GISAID strains were generated using MAFFT version 7.130b with default parameters (32).
  Resource: MAFFT; suggested: (MAFFT, RRID:SCR_011811)
- Sentence: Sequences were manually curated in JalView (33) to trim the ends and to remove sequences containing spurious inserts.
  Resource: JalView; suggested: (Jalview, RRID:SCR_006459)
- Sentence: Phylogenetic trees were generated using FastTree with the generalized time-reversible model for nucleotide sequences (34).
  Resource: FastTree; suggested: (FastTree, RRID:SCR_015501)
- Sentence: Analysis of the nsp12 polymerase and S protein genes: The nsp12 viral polymerase and S protein genes were analyzed by plotting SNP density in the consensus alignment using Python (Python v3.4.3, Biopython Package v1.72).
  Resource: Python; suggested: (IPython, RRID:SCR_001658)
  Resource: Biopython; suggested: (Biopython, RRID:SCR_007173)
Results from OddPub: We did not detect open data. We also did not detect open code. Researchers are encouraged to share open data when possible (see Nature blog).
Results from LimitationRecognizer: An explicit section about the limitations of the techniques employed in this study was not found. We encourage authors to address study limitations.
Results from TrialIdentifier: No clinical trial numbers were referenced.
Results from Barzooka: We did not find any issues relating to the usage of bar graphs.
Results from JetFighter: We did not find any issues relating to colormaps.
Results from rtransparent:
- Thank you for including a conflict of interest statement. Authors are encouraged to include this statement when submitting to a journal.
- Thank you for including a funding statement. Authors are encouraged to include this statement when submitting to a journal.
- No protocol registration statement was detected.
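The resources table quotes the manuscript as analyzing the nsp12 polymerase and S protein genes "by plotting SNP density in the consensus alignment" with Python and Biopython. The authors' script is not part of this record, so the following is only a minimal sketch of what such a calculation might look like. It assumes an already aligned FASTA (for example MAFFT output) with one record treated as the reference. The function names, the window size, and the toy sequences are all hypothetical. It uses plain Python instead of Biopython so that it runs without extra packages, and it stops at the per-window counts rather than drawing the plot.

```python
from typing import Dict, List

# Hypothetical toy alignment; real input would be the MAFFT-aligned genomes.
TOY_ALIGNMENT = """\
>ref
ACGTACGTACGT
>s1
ACGTTCGTACGT
>s2
ACGTACGTACNA
>s3
AC-TACGTACGT
"""


def parse_fasta(text: str) -> Dict[str, str]:
    """Parse aligned FASTA text into {record name: uppercase sequence}."""
    parts: Dict[str, List[str]] = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].split()[0]
            parts[name] = []
        elif name is not None:
            parts[name].append(line.upper())
    return {n: "".join(chunks) for n, chunks in parts.items()}


def variable_columns(aligned: Dict[str, str], reference: str) -> List[int]:
    """Return 0-based alignment columns where any sequence differs from the reference.

    Gaps and ambiguous bases (anything other than A, C, G, T) are ignored,
    which is an assumption of this sketch, not a documented choice of the authors.
    """
    ref = aligned[reference]
    if any(len(seq) != len(ref) for seq in aligned.values()):
        raise ValueError("all sequences must have the same aligned length")
    positions = []
    for i, ref_base in enumerate(ref):
        if ref_base not in "ACGT":
            continue
        for name, seq in aligned.items():
            if name != reference and seq[i] in "ACGT" and seq[i] != ref_base:
                positions.append(i)
                break
    return positions


def snp_density(positions: List[int], length: int, window: int) -> List[int]:
    """Count variable columns in consecutive, non-overlapping windows."""
    counts = [0] * ((length + window - 1) // window)
    for p in positions:
        counts[p // window] += 1
    return counts


aligned = parse_fasta(TOY_ALIGNMENT)
positions = variable_columns(aligned, reference="ref")
density = snp_density(positions, length=len(aligned["ref"]), window=4)
print(positions, density)
```

On real data the window counts could be passed to any plotting library, and the alignment could be read with Biopython's `AlignIO` module instead of the hand-written parser.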
