Genomic epidemiology of the Los Angeles COVID-19 outbreak and the early history of the B.1.43 strain in the USA

Abstract

Background

The severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) has caused global disruption of human health and activity. Being able to trace the early outbreak of SARS-CoV-2 within a locality can inform public health measures and provide insights to contain or prevent viral transmission. Investigation of the transmission history requires efficient sequencing methods and analytic strategies, which can be generally useful in the study of viral outbreaks.

Methods

The County of Los Angeles (hereafter, LA County) sustained a large outbreak of SARS-CoV-2. To learn about the transmission history, we carried out surveillance sequencing and determined 142 viral genomes from unique patients seeking care at the University of California, Los Angeles (UCLA) Health System. Of these genomes, 86 were from samples collected before April 19, 2020.

Results

We found that the early outbreak in LA County, as in other international air travel hubs, was seeded by multiple introductions of strains from Asia and Europe. We identified a USA-specific strain, B.1.43, which was found predominantly in California and Washington State. While samples from LA County carried the ancestral B.1.43 genome, viral genomes from neighboring counties in California and from counties in Washington State carried additional mutations, suggesting a potential origin of B.1.43 in Southern California. We quantified the transmission rate of SARS-CoV-2 over time, and found evidence that the public health measures put in place in LA County to control the virus were effective at preventing transmission, but might have been undermined by the many introductions of SARS-CoV-2 into the region.

Conclusion

Our work demonstrates that genome sequencing can be a powerful tool for investigating outbreaks and informing the public health response. Our results reinforce the critical need for the USA to have coordinated inter-state responses to the pandemic.

Article activity feed

  1. SciScore for 10.1101/2020.09.15.20194712:

    Please note, not all rigor criteria are appropriate for all manuscripts.

    Table 1: Rigor

    NIH rigor criteria are not applicable to paper type.

    Table 2: Resources

    Software and Algorithms
    Sentences | Resources
    We used bcl2fastq (v2.20.0.422) to obtain libraries for each sample, allowing 1 barcode mismatch for NEB Ultra II samples and 0 barcode mismatches for V-seq libraries.
    bcl2fastq
    suggested: (bcl2fastq, RRID:SCR_015058)
    This script uses the ShortRead (v1.46) (Morgan et al., 2009) package from Bioconductor (Huber et al., 2015).
    ShortRead
    suggested: (ShortRead, RRID:SCR_006813)
    Bioconductor
    suggested: (Bioconductor, RRID:SCR_006442)
    For the NEB libraries, PCR duplicates were removed using MarkDuplicates from the Picard tool suite (v2.22.2; http://broadinstitute.github.io/picard).
    Picard
    suggested: (Picard, RRID:SCR_006525)
    We visualized the relationship of these metrics to the cycling threshold (Ct) of the RT-qPCR used to detect the presence of SARS-CoV-2 in each patient sample using ggplot2 (v3.3) (Wickham, 2016).
    ggplot2
    suggested: (ggplot2, RRID:SCR_014601)
    Phylodynamics analysis: To investigate how the transmission of SARS-CoV-2 changed over time in LA County, we used a Bayesian birth-death skyline model implemented in BEAST (v2.5) (Stadler et al., 2013).
    BEAST
    suggested: (BEAST, RRID:SCR_010228)
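
The bcl2fastq entry above mentions allowing 1 or 0 barcode mismatches during demultiplexing. As a rough illustration of what such a tolerance means, the sketch below assigns a read to a sample when its barcode is within a given Hamming distance of exactly one known barcode. It is a simplified teaching example, not bcl2fastq's implementation; the function names and the barcodes in `SAMPLES` are hypothetical.

```python
def hamming(a: str, b: str) -> int:
    """Count the positions at which two equal-length barcodes differ."""
    if len(a) != len(b):
        raise ValueError("barcodes must have equal length")
    return sum(x != y for x, y in zip(a, b))


def assign_sample(read_barcode: str, sample_barcodes: dict, max_mismatches: int):
    """Return the single sample whose barcode is within max_mismatches.

    Returns None when no sample matches or when the match is ambiguous.
    """
    hits = [
        sample
        for sample, barcode in sample_barcodes.items()
        if hamming(read_barcode, barcode) <= max_mismatches
    ]
    return hits[0] if len(hits) == 1 else None


# Hypothetical barcodes, for illustration only.
SAMPLES = {"sample_A": "ACGTAC", "sample_B": "TTGCAA"}

# A read with one sequencing error in its barcode.
observed = "ACGTAA"
with_one_mismatch = assign_sample(observed, SAMPLES, max_mismatches=1)
with_zero_mismatches = assign_sample(observed, SAMPLES, max_mismatches=0)
```

With a tolerance of 1, the erroneous read is still assigned to `sample_A`; with a tolerance of 0, it is left unassigned. Stricter settings trade lost reads for a lower risk of misassignment.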

    Results from OddPub: Thank you for sharing your code and data.


    Results from LimitationRecognizer: We detected the following sentences addressing limitations in the study:
    A limitation of our study is that we partially relied on publicly available genome sequences for our inferences. Publicly available genomes are sampled at different rates throughout the US and the world. As an example, the lack of B.1.43 lineages outside Washington State and California could reflect a lack of sequencing data from other states. In agreement with another study of LA County genomes (Zhang et al., 2020), we found evidence that SARS-CoV-2 has been introduced many times. However, without detailed travel information, we cannot pinpoint the sources of these introductions or rule out community transmission post-introduction. Finally, the UCLA patient population is affluent relative to all of LA County, and likely to travel more frequently, which suggests that our analysis may overestimate the relative importance of introductions to the overall dynamics of the SARS-CoV-2 outbreak in LA County. Early in the pandemic, LA County officials followed the advice of public health experts. Schools, bars, and gyms were closed on March 16, 2020, and all non-essential business activity was stopped on March 20, 2020. However, even after these orders were put in place, the number of reported daily cases continued to increase, with an average of ~850 cases per day in April and May (LA Times’ independent count; https://github.com/datadesk/california-coronavirus-data). We analyzed the rate of transmission of SARS-CoV-2 in LA County using the genome sequences, and found evidence that th...

    Results from TrialIdentifier: No clinical trial numbers were referenced.


    Results from Barzooka: We did not find any issues relating to the usage of bar graphs.


    Results from JetFighter: We did not find any issues relating to colormaps.


    Results from rtransparent:
    • Thank you for including a conflict of interest statement. Authors are encouraged to include this statement when submitting to a journal.
    • Thank you for including a funding statement. Authors are encouraged to include this statement when submitting to a journal.
    • No protocol registration statement was detected.

    About SciScore

    SciScore is an automated tool that is designed to assist expert reviewers by finding and presenting formulaic information scattered throughout a paper in a standard, easy-to-digest format. SciScore checks for the presence and correctness of RRIDs (research resource identifiers), and for rigor criteria such as sex and investigator blinding. For details on the theoretical underpinning of rigor criteria and the tools shown here, including references cited, please follow this link.
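
To make the idea of checking for the presence of RRIDs concrete, here is a minimal sketch that pulls RRID-style identifiers out of free text with a regular expression. It is not how SciScore works internally; the pattern is a simplified assumption that covers identifiers shaped like those in the resources table of this report (for example `SCR_006525`), and `extract_rrids` is a hypothetical helper name.

```python
import re

# Simplified, assumed pattern: "RRID:" followed by a registry prefix,
# an underscore, and an alphanumeric identifier.
RRID_PATTERN = re.compile(r"RRID:\s*([A-Z]+_[A-Za-z0-9_-]+)")


def extract_rrids(text: str) -> list:
    """Return all RRID-style identifiers found in the text, in order."""
    return RRID_PATTERN.findall(text)


lines = [
    "suggested: (Picard, RRID:SCR_006525)",
    "suggested: (BEAST, RRID:SCR_010228)",
    "No identifier on this line.",
]
found = [extract_rrids(line) for line in lines]
```

Checking correctness, as opposed to presence, would additionally require looking each identifier up in a resource registry, which this sketch does not attempt.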