Predicting the mutational drivers of future SARS-CoV-2 variants of concern

Abstract

SARS-CoV-2 evolution threatens vaccine- and natural infection–derived immunity and the efficacy of therapeutic antibodies. To improve public health preparedness, we sought to predict which existing amino acid mutations in SARS-CoV-2 might contribute to future variants of concern. We tested the predictive value of features comprising epidemiology, evolution, immunology, and neural network–based protein sequence modeling and identified primary biological drivers of SARS-CoV-2 intrapandemic evolution. We found evidence that ACE2-mediated transmissibility and resistance to population-level host immunity have each waxed and waned as a primary driver of SARS-CoV-2 evolution over time. We retroactively identified with high accuracy (area under the receiver operating characteristic curve = 0.92 to 0.97) mutations that would go on to spread, up to 4 months in advance, across different phases of the pandemic. The behavior of the model was consistent with a plausible causal structure in which epidemiological covariates combine the effects of diverse and shifting drivers of viral fitness. We applied our model to forecast mutations that will spread in the future and to characterize how these mutations affect the binding of therapeutic antibodies. These findings demonstrate that it is possible to forecast the driver mutations that could appear in emerging SARS-CoV-2 variants of concern. We validated this result against Omicron, showing elevated predictive scores for its component mutations before emergence and a rapid score increase across daily forecasts during emergence. This modeling approach may be applied to any rapidly evolving pathogen with sufficiently dense genomic surveillance data, such as influenza, and to unknown future pandemic viruses.
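
The accuracy figure quoted in the abstract is an area under the receiver operating characteristic curve (AUROC). As a rough illustration of what that metric measures, the sketch below computes AUROC as the fraction of (spreading, non-spreading) mutation pairs that a forecast ranks in the right order. The function name, scores, and labels are hypothetical toy values chosen for illustration; this is not the authors' evaluation code or data.

```python
def auroc(scores, labels):
    """AUROC as the probability that a randomly chosen positive example
    receives a higher score than a randomly chosen negative one.
    Tied scores count as half a correctly ordered pair."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    ordered = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                ordered += 1.0
            elif p == n:
                ordered += 0.5
    return ordered / (len(pos) * len(neg))


# Hypothetical forecast scores for six mutations, with labels marking
# whether each mutation later spread (1) or did not (0).
toy_scores = [0.91, 0.80, 0.30, 0.62, 0.10, 0.05]
toy_labels = [1, 1, 1, 0, 0, 0]

# 8 of the 9 positive/negative pairs are ordered correctly.
print(round(auroc(toy_scores, toy_labels), 3))
```

An AUROC of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which is the scale on which the reported 0.92 to 0.97 should be read.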

Article activity feed

SciScore for 10.1101/2021.06.21.21259286:

Please note, not all rigor criteria are appropriate for all manuscripts.

Table 1: Rigor

NIH rigor criteria are not applicable to paper type.

Table 2: Resources

Experimental Models: Cell Lines
Sentences	Resources
G Epsilon, B.1.427 (additional associated mutations): S13I, W152C B.1.429 (CDC VOCs): S13I, W152C, L452R, D614G SARS-CoV-2 pseudotyped VSV production and neutralization: To generate SARS-CoV-2 pseudotyped vesicular stomatitis virus, Lenti-X 293T cells (Takara) were seeded in 10-cm dishes for 80%.	293T suggested: None
For viral neutralization, Vero E6 cells were seeded into black-walled, clear-bottom 96-well plates at 20,000 cells/well and cultured overnight at 37°C.	Vero E6 suggested: None
Software and Algorithms
Sentences	Resources
Code availability and environment: Analyses on GISAID data extracts were conducted in python (Python Software Foundation.	python suggested: (IPython, RRID:SCR_001658)
Available at http://www.python.org).	http://www.python.org suggested: (CVXOPT - Python Software for Convex Optimization, RRID:SCR_002918)
Mutations were then extracted as compared to the reference with R 4.0.2 (https://www.r-project.org/) using Biostrings 2.56.0 (https://bioconductor.org/packages/Biostrings) and haplotypes were obtained by combining all amino acid mutations (substitutions, insertions, and deletions) identified on the Spike protein when compared to the reference sequence.	https://www.r-project.org/ suggested: (R Project for Statistical Computing, RRID:SCR_001905) Biostrings suggested: (Biostrings, RRID:SCR_016949)
Natural selection features were generated using MEME [45] and FEL [23] methods implemented in the HyPhy package [24] (version 2.5.31).	HyPhy suggested: (HyPhy, RRID:SCR_016162)
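
The Biostrings row above describes extracting mutations relative to a reference and combining all Spike amino acid changes into a haplotype. The sketch below illustrates that idea in Python under simplifying assumptions: the two sequences are already pairwise-aligned with `-` as the gap character, and the function names, mutation labels, and toy sequences are hypothetical. The source reports doing this step in R with Biostrings; this is not that code.

```python
def call_mutations(ref_aln, query_aln):
    """List amino acid differences between two aligned sequences.

    Both inputs must have equal length; '-' marks an alignment gap.
    Positions are 1-based and counted along the ungapped reference.
    """
    if len(ref_aln) != len(query_aln):
        raise ValueError("aligned sequences must have equal length")
    mutations = []
    ref_pos = 0
    for r, q in zip(ref_aln, query_aln):
        if r != "-":
            ref_pos += 1
        if r == q:
            continue
        if r == "-":
            # Residue present only in the query: insertion after ref_pos.
            mutations.append(f"ins{ref_pos}{q}")
        elif q == "-":
            # Residue present only in the reference: deletion.
            mutations.append(f"{r}{ref_pos}del")
        else:
            mutations.append(f"{r}{ref_pos}{q}")
    return mutations


def haplotype(mutations):
    """Combine all mutations of one sequence into a single haplotype label."""
    return ",".join(mutations) if mutations else "reference"


# Toy aligned sequences (not real Spike data).
toy_ref = "MFVFLVLL"
toy_query = "MFIFL-LL"
print(haplotype(call_mutations(toy_ref, toy_query)))
```

Sequences that share the same haplotype label carry the same set of substitutions, insertions, and deletions relative to the reference, which is what allows them to be counted together.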

Results from OddPub: We did not detect open data. We also did not detect open code. Researchers are encouraged to share open data when possible (see Nature blog).

Results from LimitationRecognizer: An explicit section about the limitations of the techniques employed in this study was not found. We encourage authors to address study limitations.

Results from TrialIdentifier: No clinical trial numbers were referenced.

Results from Barzooka: We did not find any issues relating to the usage of bar graphs.

Results from JetFighter: We did not find any issues relating to colormaps.

Results from rtransparent:

Thank you for including a conflict of interest statement. Authors are encouraged to include this statement when submitting to a journal.
Thank you for including a funding statement. Authors are encouraged to include this statement when submitting to a journal.
No protocol registration statement was detected.

Results from scite Reference Check: We found no unreliable references.

Related articles

Genomic characterization of SARS-CoV-2 variants circulating in the population of Bangui, Central African Republic (CAR) in 2022.
Pulchérie Pelembi, Philippe Colson, Alain Farra, Ornella Anne Sibiro-Demi, Christian Noël Malaka, Aurélia Kwasiborski, Véronique Hourdel, Gilles Landry Ngaya, Romaric Nzoumbou-Boko, Jean-Claude Manuguerra, Emmanuel Ryvalin Nakoune-Yandoko, Guy Vernet, Bernard La Scola, Valérie Caro, Alexandre Manirakiza

An HLA Association With COVID-19 Vaccine Reactogenicity Correlates With Fewer SARS-CoV-2 Infections and Monocyte Activation
Jill Hollenbach, Anshika Srivastava, Demetra Chatzileontiadou, Anurag Adhikari, Rayo Suseno, Sean Lin, Juliano Boquett, Jamie Tuibeo, Tasneem Yusufali, Noah Peyser, Ticiana Farias, Katherine Kichula, Andrea Nguyen, Irvin Jose, Dhilshan Jayasinghe, Katerina Tarassi, Elissavet Kontou, Janesha Maddumage, Kleio Ampelakiotou, Alexandra Tsirogianni, Michael Dewar-Oldis, Peter Barnard, Joe Sabatino, Dimitrios Zoulas, Emily Ariens, Timothy Mercer, Emma Grant, Lloyd D'Orsogna, Corey Smith, Paul Norman, Gregory Marcus, Jeffrey Olgin, Mark J. Pletcher, Martin Maiers, Stephanie Gras

Multi-epitope vaccine construct against bovine tuberculosis: insights from immunoinformatics and molecular dynamics simulations
Truc Ly Nguyen