Comparative Genomics and Characterization of SARS-CoV-2 P.1 (Gamma) Variant of Concern From Amazonas, Brazil
This article has been reviewed by the following groups
Listed in
- Evaluated articles (ScreenIT)
Abstract
The P.1 lineage (Gamma) was first described in the State of Amazonas, northern Brazil, at the end of 2020, and has emerged worldwide as an important variant of concern (VOC) of SARS-CoV-2. P.1 has been linked to increased infectivity, higher mortality, and immune evasion, leading to reinfections and potentially reduced efficacy of vaccines and neutralizing antibodies.
Methods
Samples from 276 patients from the State of Amazonas were sent to a central referral laboratory for sequencing by gold-standard techniques on the Illumina MiSeq platform. Both global and regional phylogenetic analyses of the successfully sequenced genomes were conducted using the maximum likelihood method. Multiple alignments were built that included previously obtained unique human SARS-CoV-2 sequences. The evolutionary histories of spike and non-structural proteins from ORF1a of northern genomes were described, and their molecular evolution was analyzed to detect positive (FUBAR, FEL, and MEME) and negative (FEL and SLAC) selective pressures. To further evaluate the possible pathways of evolution leading to the emergence of P.1, we performed a specific analysis for copy-choice recombination events. A global phylogenomic analysis with subsampled P.1 and B.1.1.28 genomes was applied to evaluate the relationship among samples.
Results
Forty-four samples from the State of Amazonas were successfully sequenced and confirmed as P.1 (Gamma) lineage. In addition to previously described P.1 characteristic mutations, we found evidence of continuous diversification of SARS-CoV-2, as rare and previously unseen P.1 mutations were detected in spike and non-structural proteins from ORF1a. No evidence of recombination was found. Several sites were shown to be under positive and negative selection, with various mutations identified mostly in the P.1 lineage. Consistent with the Pango assignment, phylogenomic analyses placed all samples in the P.1 lineage.
Conclusion
P.1 has shown continuous evolution after its emergence. The lack of clear evidence for recombination and the positive selection demonstrated for several sites suggest that the emergence of this lineage resulted mainly from strong evolutionary forces and progressive accumulation of a favorable signature set of mutations.
Article activity feed
SciScore for 10.1101/2021.10.30.21265694:
Please note, not all rigor criteria are appropriate for all manuscripts.
Table 1: Rigor
- Ethics: not detected.
- Sex as a biological variable: not detected.
- Randomization: not detected.
- Blinding: not detected.
- Power Analysis: not detected.
Table 2: Resources
Software and Algorithms (each sentence is followed by the detected resource and its suggested identifier)
- The alignment of the sequenced reads to the reference SARS-CoV-2 genome (GenBank ID: NC_045512.2) was performed by Bowtie v2.4.2 (12) and additional parameters as end-to-end and very-sensitive.
  Bowtie, suggested: (Bowtie, RRID:SCR_005476)
- The analysis of the sequencing coverage and depth was generated by samtools v1.11 (13) with minimum base quality per base (Q) ≥ 30.
  samtools, suggested: (SAMTOOLS, RRID:SCR_002105)
- The genetic distance plot comparison of the 44 sequenced genomes from this study with the NC_045512.2 reference was performed with the Python package recan using a window of 200 nt and a shift of 50 nt as parameters (28).
  Python, suggested: (IPython, RRID:SCR_001658)
- Molecular evolution of spike and non-structural proteins from ORF1a of northern samples: Selection tests were performed with HyPhy v2.5.32 (31) using the nucleotide sequence alignment and the maximum likelihood tree (previously described) for spike and non-structural proteins from ORF1a.
  HyPhy, suggested: (HyPhy, RRID:SCR_016162)
- The multiple sequence alignment with the P.1, B.1.1.28, the 44 sequenced genomes from this study, and the SARS-CoV-2 reference genome NC_045512.2 was generated by the MAFFT web server (1PAM / κ=2 scoring matrix), while the alignment trimming (deletion of 265 and 259 nucleotides at 5’ and 3’ ends, respectively) was performed with UGENE (17).
  MAFFT, suggested: (MAFFT, RRID:SCR_011811)
- The tree visualization and editing was generated by the FigTree software (http://tree.bio.ed.ac.uk/software/figtree/).
  FigTree, suggested: (FigTree, RRID:SCR_008515)
Results from OddPub: We did not detect open data. We also did not detect open code. Researchers are encouraged to share open data when possible (see Nature blog).
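The window-based genetic distance comparison listed in the resources (a 200 nt window moved in 50 nt steps along the genome) can be illustrated with a small sketch. This is not the recan implementation and not the authors' code: the function names `p_distance` and `windowed_distances` are hypothetical, and the distance used here is a plain proportion of mismatched sites (p-distance), chosen only as an assumption for illustration. Both sequences are assumed to come from the same alignment and therefore to have equal length.

```python
def p_distance(a: str, b: str) -> float:
    """Proportion of differing sites between two aligned fragments.

    Positions where either sequence has a gap or an ambiguous base
    (anything other than A, C, G, T) are skipped. Returns NaN when no
    comparable position remains.
    """
    valid = 0
    diff = 0
    for x, y in zip(a.upper(), b.upper()):
        if x in "ACGT" and y in "ACGT":
            valid += 1
            if x != y:
                diff += 1
    return diff / valid if valid else float("nan")


def windowed_distances(ref: str, query: str, window: int = 200, shift: int = 50):
    """Slide a fixed-size window along two aligned sequences.

    Returns a list of (window_start, distance) pairs. The defaults mirror
    the 200 nt window and 50 nt shift quoted in the text; the distance
    measure itself is an illustrative assumption.
    """
    if len(ref) != len(query):
        raise ValueError("sequences must be aligned to the same length")
    results = []
    for start in range(0, len(ref) - window + 1, shift):
        end = start + window
        results.append((start, p_distance(ref[start:end], query[start:end])))
    return results


if __name__ == "__main__":
    # Toy data only: a 400 nt reference and a query with 10 substitutions.
    reference = "A" * 400
    sample = "A" * 200 + "C" * 10 + "A" * 190
    for start, dist in windowed_distances(reference, sample):
        print(start, round(dist, 3))
```

Plotting the distance against the window start for each genome gives the kind of profile used to look for recombination: a recombinant segment would appear as a localized change in distance to one parent lineage, whereas a flat profile is consistent with no recombination signal.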
Results from LimitationRecognizer: We detected the following sentences addressing limitations in the study:
Considering the limitations related to a subsampled global phylogeny analysis, it is not possible to know if the absence of well-supported monophyletic groups is due to the missing branches and nodes representing the most related sequences and their ancestors or some technical limitation inherent to the application of the branch support statistical tests to the SARS-CoV-2 genomic sequences, since bootstrapping approaches require multiple sites supporting a clade to infer strong support value in near-perfect trees (57). In fact, SARS-CoV-2 genomes present a low number of informative sites, which may generate topology with low statistical support and ambiguous clustering of large data sets (58). Additionally, both SH-aLRT and ultrafast bootstrap methods rely on bootstrap resampling (57). Considering all aspects discussed above and the divergence among sequences, it is expected that the P.1 genomes (specially in the spike sequence analysis) cluster together and phylogenetic trees including multiple lineages present better statistical support values inter-lineages in comparison with intra-lineages. In summary, the association of diversification of P.1 sequences, the known phenotypic consequences of some signature mutations, the confirmation of positive selection acting on some sites and the absence of evidence for recombinations, all suggest that the main driving force in the evolution of P.1 viruses was selective pressure.
Results from TrialIdentifier: No clinical trial numbers were referenced.
Results from Barzooka: We did not find any issues relating to the usage of bar graphs.
Results from JetFighter: We did not find any issues relating to colormaps.
Results from rtransparent:
- Thank you for including a conflict of interest statement. Authors are encouraged to include this statement when submitting to a journal.
- Thank you for including a funding statement. Authors are encouraged to include this statement when submitting to a journal.
- No protocol registration statement was detected.
Results from scite Reference Check: We found no unreliable references.