Large-scale sequencing of SARS-CoV-2 genomes from one region allows detailed epidemiology and enables local outbreak management

Andrew J. Page
Alison E. Mather
Thanh Le-Viet
Emma J. Meader
Nabil-Fareed Alikhan
Gemma L. Kay
Leonardo de Oliveira Martins
Alp Aydin
David J. Baker
Alexander J. Trotter
Steven Rudder
Ana P. Tedim
Anastasia Kolyva
Rachael Stanley
Muhammad Yasir
Maria Diaz
Will Potter
Claire Stuart
Lizzie Meadows
Andrew Bell
Ana Victoria Gutierrez
Nicholas M. Thomson
Evelien M. Adriaenssens
Tracey Swingler
Rachel A. J. Gilroy
Luke Griffith
Dheeraj K. Sethi
Dinesh Aggarwal
Colin S. Brown
Rose K. Davidson
Robert A. Kingsley
Luke Bedford
Lindsay J. Coupland
Ian G. Charles
Ngozi Elumogo
John Wain
Reenesh Prakash
Mark A. Webber
S. J. Louise Smith
Meera Chand
Samir Dervisevic
Justin O’Grady
The COVID-19 Genomics UK (COG-UK) Consortium

Abstract

The COVID-19 pandemic has spread rapidly throughout the world. In the UK, the initial peak was in April 2020; in the county of Norfolk (UK) and surrounding areas, which have a stable, low-density population, over 3200 cases were reported between March and August 2020. As part of the activities of the national COVID-19 Genomics Consortium (COG-UK) we undertook whole genome sequencing of the SARS-CoV-2 genomes present in positive clinical samples from the Norfolk region. These samples were collected by four major hospitals, multiple minor hospitals, care facilities and community organizations within Norfolk and surrounding areas. We combined clinical metadata with the sequencing data from regional SARS-CoV-2 genomes to understand the origins, genetic variation, transmission and expansion (spread) of the virus within the region and provide context nationally. Data were fed back into the national effort for pandemic management, whilst simultaneously being used to assist local outbreak analyses. Overall, 1565 positive samples (172 per 100 000 population) from 1376 cases were evaluated; for 140 cases, between two and six samples were available, providing longitudinal data. This represented 42.6 % of all positive samples identified by hospital testing in the region and encompassed those with clinical need, and health and care workers and their families. In total, 1035 cases had genome sequences of sufficient quality to provide phylogenetic lineages. These genomes belonged to 26 distinct global lineages, indicating that there were multiple separate introductions into the region. Furthermore, 100 genetically distinct UK lineages were detected, demonstrating local evolution, at a rate of ~2 SNPs per month, and multiple co-occurring lineages as the pandemic progressed.
Our analysis identified a discrete sublineage associated with six care facilities; found no evidence of reinfection in longitudinal samples; ruled out a nosocomial outbreak; identified 16 lineages in key workers that were not found in patients, indicating that infection control measures were effective; found that the D614G spike protein mutation, which is linked to increased transmissibility, dominates the samples; and rapidly confirmed the relatedness of cases in an outbreak at a food processing facility. The large-scale genome sequencing of SARS-CoV-2-positive samples has provided valuable additional data for public health epidemiology in the Norfolk region, and will continue to help identify and untangle hidden transmission chains as the pandemic evolves.
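
The headline figures in the abstract can be cross-checked with a little arithmetic. The sketch below is illustrative only: it treats the reported values as exact, and it assumes a reference genome length of 29,903 nt for Wuhan-Hu-1 (MN908947.3), which is not stated in the abstract. The back-calculated population and regional sample totals are implied estimates, not numbers reported by the authors, and the per-site rate assumes that "~2 SNPs per month" is a genome-wide figure.

```python
# Figures quoted in the abstract.
POSITIVE_SAMPLES = 1565
SAMPLES_PER_100K = 172
FRACTION_OF_REGIONAL_POSITIVES = 0.426  # 42.6 %
CASES_EVALUATED = 1376
CASES_WITH_LINEAGE = 1035
SNPS_PER_MONTH = 2  # approximate value from the abstract

# Assumption (not in the abstract): length of the MN908947.3 reference.
REFERENCE_LENGTH_NT = 29_903


def implied_population(samples: int, per_100k: float) -> float:
    """Population size implied by a 'samples per 100 000' figure."""
    return samples / per_100k * 100_000


def implied_total_positives(samples: int, fraction: float) -> float:
    """Regional hospital-test positives implied by the sequenced fraction."""
    return samples / fraction


def lineage_yield(cases_with_lineage: int, cases: int) -> float:
    """Share of evaluated cases with a genome good enough for lineage calls."""
    return cases_with_lineage / cases


def substitutions_per_site_per_year(snps_per_month: float, genome_length: int) -> float:
    """Convert a genome-wide SNPs-per-month rate to a per-site annual rate."""
    return snps_per_month * 12 / genome_length


if __name__ == "__main__":
    print(f"Implied population: {implied_population(POSITIVE_SAMPLES, SAMPLES_PER_100K):,.0f}")
    print(f"Implied regional positives: "
          f"{implied_total_positives(POSITIVE_SAMPLES, FRACTION_OF_REGIONAL_POSITIVES):,.0f}")
    print(f"Lineage yield: {lineage_yield(CASES_WITH_LINEAGE, CASES_EVALUATED):.1%}")
    print(f"Rate: {substitutions_per_site_per_year(SNPS_PER_MONTH, REFERENCE_LENGTH_NT):.2e} "
          "substitutions/site/year")
```

Under these assumptions the figures hang together: roughly three quarters of evaluated cases yielded a usable lineage, and the per-site rate is on the order of 10^-3 substitutions per site per year.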

Version published to 10.1099/mgen.0.000589 on Access Microbiology
Mar 24, 2021

SciScore for 10.1101/2020.09.28.20201475:

Please note, not all rigor criteria are appropriate for all manuscripts.

Table 1: Rigor

Institutional Review Board Statement	not detected.
Randomization	not detected.
Blinding	not detected.
Power Analysis	not detected.
Sex as a biological variable	not detected.

Table 2: Resources

Software and Algorithms
Sentences	Resources
Sequence analysis: Raw reads were demultiplexed using bcl2fastq (v2.20) (Illumina Inc.) allowing for zero mismatches in the dual barcodes to produce FASTQ files.	bcl2fastq suggested: (bcl2fastq, RRID:SCR_015058)
Briefly, read adapters were trimmed using TrimGalore (https://github.com/FelixKrueger/TrimGalore) and aligned to the Wuhan Hu-1 reference genome (accession MN908947.3) using BWA-MEM (v0.7.17) (Li 2013); ARTIC amplicons were masked and a consensus built using iVAR (v.1.2) (Grubaugh et al. 2019).	TrimGalore suggested: None; BWA-MEM suggested: (Sniffles, RRID:SCR_017619)
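
The processing steps quoted in the rows above (adapter trimming, alignment to MN908947.3, amplicon masking and consensus calling) could be laid out roughly as follows. This is a minimal sketch, not the authors' pipeline: it only builds shell command strings and runs nothing. The file names, the primer BED file, the use of `ivar trim` for the masking step, and the reliance on default thresholds are all assumptions, since the quoted sentences give no options.

```python
import shlex
from pathlib import Path


def build_consensus_commands(sample, r1, r2, reference, primer_bed, outdir="out"):
    """Return shell commands for a trim -> align -> mask -> consensus run.

    All paths and the sample name are hypothetical placeholders.
    """
    out = Path(outdir)
    q = shlex.quote
    # Trim Galore names paired outputs <basename>_val_1/_val_2 (gzipped input assumed).
    trimmed_r1 = out / f"{sample}_val_1.fq.gz"
    trimmed_r2 = out / f"{sample}_val_2.fq.gz"
    aligned = out / f"{sample}.sorted.bam"
    masked_prefix = out / f"{sample}.trimmed"          # ivar writes <prefix>.bam
    masked_sorted = out / f"{sample}.trimmed.sorted.bam"
    consensus_prefix = out / f"{sample}.consensus"     # ivar writes <prefix>.fa

    return [
        # 1. Adapter trimming of the paired reads.
        shlex.join(["trim_galore", "--paired", "--basename", sample,
                    "--output_dir", str(out), r1, r2]),
        # 2. Index the reference, then align with BWA-MEM and sort.
        shlex.join(["bwa", "index", reference]),
        f"bwa mem {q(reference)} {q(str(trimmed_r1))} {q(str(trimmed_r2))}"
        f" | samtools sort -o {q(str(aligned))} -",
        shlex.join(["samtools", "index", str(aligned)]),
        # 3. Assumed interpretation of "amplicons were masked": soft-clip ARTIC primers.
        shlex.join(["ivar", "trim", "-i", str(aligned), "-b", primer_bed,
                    "-p", str(masked_prefix)]),
        shlex.join(["samtools", "sort", "-o", str(masked_sorted),
                    f"{masked_prefix}.bam"]),
        # 4. Consensus calling from a pileup, using iVar's default thresholds.
        f"samtools mpileup -aa -A -d 0 -Q 0 {q(str(masked_sorted))}"
        f" | ivar consensus -p {q(str(consensus_prefix))}",
    ]


if __name__ == "__main__":
    for cmd in build_consensus_commands(
        "sample1", "sample1_R1.fastq.gz", "sample1_R2.fastq.gz",
        "MN908947.3.fasta", "artic_primers.bed",
    ):
        print(cmd)
```

Returning strings rather than executing them keeps the sketch easy to inspect; the versions named in the report (BWA-MEM v0.7.17, iVar v1.2) may accept slightly different options than those shown.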

Results from OddPub: Thank you for sharing your code.

Results from LimitationRecognizer: An explicit section about the limitations of the techniques employed in this study was not found. We encourage authors to address study limitations.

Results from TrialIdentifier: No clinical trial numbers were referenced.

Results from Barzooka: We did not find any issues relating to the usage of bar graphs.

Results from JetFighter: We did not find any issues relating to colormaps.

Results from rtransparent:

Thank you for including a conflict of interest statement. Authors are encouraged to include this statement when submitting to a journal.
No funding statement was detected.
No protocol registration statement was detected.

Version published to 10.1101/2020.09.28.20201475 on medRxiv
Sep 30, 2020

Related articles

Overview of SARS-CoV-2 Genomic Surveillance in Central America and the Dominican Republic from February 2020 to January 2023: The Impact of PAHO and COMISCA's Collaborative Efforts

This article has 31 authors:
1. Sofia Herrera Agüero
2. Aldo Sosa
3. Alexander Martínez
4. Ambar Moreno
5. César Roberto Conde Pereira
6. Claudia Gonzalez
7. Claudio Soto Garita
8. Daniel Ulate
9. Estela Cordero-Laurent
10. Hebleen Brenes
11. Isaac Miguel Sánchez
12. Jairo Mendez-Rico
13. Jessica Góndola
14. Jose Arturo Molina-Mora
15. Juliana Leite
16. Leticia Franco
17. Linda Mendoza
18. Lionel Gresh
19. Lucia De La Cruz
20. Mitzi Castro Paz
21. Monica Barahona
22. Naomi Iihoshi
23. Oris Chavarria
24. Priscila Born
25. Ruby Melany Aguillón
26. Ruth Carolina Vasquez Cordova
27. Selene Gonzalez
28. Sofia Carolina Alvarado Silva
29. Xochitl Sandoval López
30. Yvonne Imbert
31. Francisco Duarte-Martínez
This article has no evaluations. Latest version: Jan 14, 2026

Genomic characterization of SARS-CoV-2 variants circulating in the population of Bangui, Central African Republic (CAR) in 2022.

This article has 15 authors:
1. Pulchérie Pelembi
2. Philippe Colson
3. Alain Farra
4. Ornella Anne Sibiro-Demi
5. Christian Noël Malaka
6. Aurélia Kwasiborski
7. Véronique Hourdel
8. Gilles Landry Ngaya
9. Romaric Nzoumbou-Boko
10. Jean-Claude Manuguerra
11. Emmanuel Ryvalin Nakoune-Yandoko
12. Guy VERNET
13. Bernard La Scola
14. Valérie Caro
15. Alexandre Manirakiza
This article has no evaluations. Latest version: Jan 19, 2026

DIVERSITY AND CLINICAL CORRELATIONS OF SARS-CoV-2 VARIANT DURING THE INTRODUCTION OF THE DELTA VARIANT IN GUATEMALA

This article has 13 authors:
1. Claudia Carranza
2. Lucia Ortiz
3. Maria Eugenia Castellanos
4. Ana Silvia Gonzalez-Reiche
5. Renata Mendizabal-Cabrera
6. Zain Khalil
7. Adriana van DeGuchte
8. Keith Farrugia
9. Mariana Herrera
10. Ernesto Mena
11. Celia Cordon-Rosales
12. Harm van Bakel
13. Daniel R. Perez
Reviewed by Access Microbiology

This article has 3 evaluations. Latest version: Feb 3, 2026. Latest activity: Jul 20, 2025
