Predicting SARS-CoV-2 evolution dynamics with spatiotemporal resolution by DMS-empowered protein language model

Abstract

Early identification of emerging dominant SARS-CoV-2 variants is essential for effective pandemic preparedness, yet existing methodologies face significant limitations. Experimental characterization is costly and not feasible for real-time surveillance, whereas existing computational approaches lack the precision needed to predict future dominant lineages and fail to capture the spatiotemporal dynamics of fitness under evolving host immune pressures. Here, we introduce DeepCoV (DMS-Empowered Evolution Prediction of CoronaVirus), a deep-learning framework for the dynamic identification of novel variants with high potential to become prevalent. It integrates deep mutational scanning (DMS)-derived mutation phenotypes with epidemiological surveillance data reflecting historical viral evolution and the dynamic fitness landscape. DeepCoV accurately forecasted the dominance of recently circulating lineages a month in advance, achieving a 90% reduction in false discovery rate while capturing the temporal and geographic dynamics of variant spread and reconstructing regional prevalence trajectories. Moreover, DeepCoV identified mutational hotspots of Omicron-derived backbones in silico, revealing convergent evolution trends. This scalable solution enables timely identification of immune-evasive variants and prospective alerts for critical mutations, providing actionable insights for vaccine updates and pandemic surveillance.
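
To make the false discovery rate claim concrete, the following minimal sketch computes the false discovery rate (the fraction of lineages flagged as future dominant that did not become dominant) and the relative reduction between two predictors. The abstract does not state the baseline method, the underlying counts, or whether the 90% figure is a relative reduction; the relative reading and all counts below are assumptions chosen purely for illustration, and the function names are hypothetical rather than part of DeepCoV.

```python
def false_discovery_rate(true_positives: int, false_positives: int) -> float:
    """Fraction of flagged lineages that did not become dominant."""
    flagged = true_positives + false_positives
    if flagged == 0:
        return 0.0  # nothing was flagged, so there are no false discoveries
    return false_positives / flagged


def relative_reduction(baseline: float, improved: float) -> float:
    """Relative drop from a baseline rate to an improved rate."""
    if baseline == 0:
        raise ValueError("baseline rate must be non-zero")
    return (baseline - improved) / baseline


# Hypothetical counts, NOT taken from the preprint.
baseline_fdr = false_discovery_rate(true_positives=10, false_positives=40)
model_fdr = false_discovery_rate(true_positives=23, false_positives=2)
reduction = relative_reduction(baseline_fdr, model_fdr)

print(f"baseline FDR: {baseline_fdr:.2f}")
print(f"model FDR:    {model_fdr:.2f}")
print(f"relative reduction: {reduction:.0%}")
```

Under these illustrative counts, a baseline that flags 50 lineages with 40 false alarms is compared against a predictor that flags 25 lineages with 2 false alarms.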
