Genomic epidemiology of SARS-CoV-2 in the UAE reveals novel virus mutation, patterns of co-infection and tissue specific host immune response

Abstract

To trace the source of SARS-CoV-2 introduction into the United Arab Emirates and the pattern of its spread and evolution there, we conducted meta-transcriptome sequencing of 1067 nasopharyngeal swab samples collected between May 9th and June 29th, 2020, during the first peak of the local COVID-19 epidemic. We identified the global clade distribution and eleven novel genetic variants that were almost absent in the rest of the world and that defined five subclades specific to the UAE viral population. Cross-settlement human-to-human transmission was related to local business activity. Perhaps surprisingly, at least 5% of the population were co-infected by SARS-CoV-2 of multiple clades within the same host. We also discovered an enrichment of cytosine-to-uracil mutations in the viral population collected from the nasopharynx, which differs from the adenosine-to-inosine change previously reported in bronchoalveolar lavage fluid samples, and a previously unidentified upregulation of APOBEC4 expression in the nasopharynx of infected patients, indicating that the innate host immune response mediated by the ADAR and APOBEC gene families could be tissue-specific. The genomic epidemiological and molecular biological knowledge reported here provides new insights into SARS-CoV-2 evolution and transmission and points to future directions for investigating host–pathogen interactions.
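As a rough illustration of what an enrichment of cytosine-to-uracil mutation means operationally, the sketch below tallies single-nucleotide substitutions by type between a reference and a sample sequence. It is a minimal sketch and not the authors' method: the function name `substitution_spectrum` and the toy sequences are hypothetical, the two inputs are assumed to be already aligned and of equal length, and C-to-U changes show up as `C>T` because sequences are conventionally written in DNA letters.

```python
from collections import Counter


def substitution_spectrum(reference: str, sample: str) -> Counter:
    """Count substitutions (e.g. 'C>T') between two pre-aligned sequences."""
    if len(reference) != len(sample):
        raise ValueError("sequences must be aligned to the same length")
    valid = set("ACGT")
    spectrum = Counter()
    for ref_base, alt_base in zip(reference.upper(), sample.upper()):
        # Skip gaps, ambiguous bases (N, R, Y, ...) and unchanged positions.
        if ref_base in valid and alt_base in valid and ref_base != alt_base:
            spectrum[f"{ref_base}>{alt_base}"] += 1
    return spectrum


# Toy example (hypothetical sequences, not real viral data).
ref = "ACGTCCGATC"
alt = "ATGTCTGANC"
print(substitution_spectrum(ref, alt))
```

In the same tally, adenosine-to-inosine edits would appear as `A>G`, since inosine is read as guanosine during sequencing, so comparing the `C>T` and `A>G` counts across sample types is one simple way to contrast the two signatures.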

SciScore for 10.1101/2021.03.09.21252822:

Please note, not all rigor criteria are appropriate for all manuscripts.

Table 1: Rigor

NIH rigor criteria are not applicable to paper type.

Table 2: Resources

Software and Algorithms
Sentences	Resources
Briefly, total reads were processed using Kraken v0.10.5 (default parameters) with a self-built database of Coronaviridae genomes (including SARS, MERS, and SARS-CoV-2 genome sequences downloaded from GISAID, NCBI, and CNGB) to identify Coronaviridae-like reads in a sensitive manner.	Kraken suggested: (Kraken, RRID:SCR_005484)
Fastp v0.19.5 (parameters: -q 20 -u 20 -n 1 -l 50) and SOAPnuke v1.5.6 (parameters: -l 20 -q 0.2 -E 50 -n 0.02 -5 0 -Q 2 -G -d) were used to remove low-quality reads, duplications, and adaptor contaminations.	Fastp suggested: (fastp, RRID:SCR_016962) SOAPnuke suggested: (SOAPnuke, RRID:SCR_015025)
Low-complexity reads were then removed using PRINSEQ v0.20.4 (parameters: -lc_method dust -lc_threshold 7).	PRINSEQ suggested: (PRINSEQ, RRID:SCR_005454)
Sequencing depth was measured using samtools depth using the default parameters.	samtools suggested: (SAMTOOLS, RRID:SCR_002105)
SARS-CoV-2 consensus sequences were generated using Pilon v1.23 (parameters: --changes --vcf --mindepth 10 --fix all,amb) (16).	Pilon suggested: (Pilon, RRID:SCR_014731)
We have also applied de novo assembly of the Coronaviridae-like reads from samples with < 100× average sequencing depth using SPAdes (v3.14.0) with the default settings.	SPAdes suggested: (SPAdes, RRID:SCR_000131)
Jalview (v1.8.3) was used to perform multiple sequence alignment and estimate the conservativeness score of the mutations (18).	Jalview suggested: (Jalview, RRID:SCR_006459)
Analysis of host ADAR and APOBEC gene expression: Reads were aligned to the human genome reference (GRCh38) using hisat2 (parameters: --phred64 --no-discordant --no-mixed -I 1 -X 1000 -p 4).	hisat2 suggested: (HISAT2, RRID:SCR_015530)
We built a maximum likelihood phylogenetic tree using the Nextstrain pipeline; Augur v6.4.3 and MAFFT v7.455 for multiple sequence alignment and IQtree v1.6.12 for phylogenetic tree construction (25).	Augur suggested: None MAFFT suggested: (MAFFT, RRID:SCR_011811) IQtree suggested: None
FigTree v1.4.4 was used to visualize and annotate the phylogenetic tree.	FigTree suggested: (FigTree, RRID:SCR_008515)
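The depth-related steps quoted in the table (measuring depth with `samtools depth`, then assembling samples below 100× average depth de novo) can be illustrated with a minimal sketch. It assumes the default three-column, tab-separated output of `samtools depth` for a single alignment file (reference name, 1-based position, depth) and a known genome length. The function names and the toy input are hypothetical, and how the authors actually averaged depth is not stated in the source.

```python
import io
from typing import TextIO


def mean_depth(depth_lines: TextIO, genome_length: int) -> float:
    """Average depth over the whole genome from `samtools depth` text output.

    Positions missing from the output are treated as zero depth, so the
    total is divided by the genome length rather than by the line count.
    """
    if genome_length <= 0:
        raise ValueError("genome_length must be positive")
    total = 0
    for line in depth_lines:
        line = line.strip()
        if not line:
            continue
        _chrom, _pos, depth = line.split("\t")
        total += int(depth)
    return total / genome_length


def needs_de_novo_assembly(avg_depth: float, threshold: float = 100.0) -> bool:
    """Flag samples below the 100x average-depth cutoff quoted in the table."""
    return avg_depth < threshold


# Toy input: a hypothetical 5-base reference with position 4 uncovered.
toy = io.StringIO("ref\t1\t120\nref\t2\t80\nref\t3\t100\nref\t5\t50\n")
avg = mean_depth(toy, genome_length=5)
print(avg, needs_de_novo_assembly(avg))
```

Dividing by the genome length matters because `samtools depth` omits zero-coverage positions unless run with `-a`. Averaging over the reported lines only would overstate depth for patchily covered samples.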

Results from OddPub: We did not detect open data. We also did not detect open code. Researchers are encouraged to share open data when possible (see Nature blog).

Results from LimitationRecognizer: An explicit section about the limitations of the techniques employed in this study was not found. We encourage authors to address study limitations.

Results from TrialIdentifier: No clinical trial numbers were referenced.

Results from Barzooka: We did not find any issues relating to the usage of bar graphs.

Results from JetFighter: We did not find any issues relating to colormaps.

Results from rtransparent:

Thank you for including a conflict of interest statement. Authors are encouraged to include this statement when submitting to a journal.
Thank you for including a funding statement. Authors are encouraged to include this statement when submitting to a journal.
No protocol registration statement was detected.
