A Genome-Wide Codon-Permissiveness Framework Uncovers Spike-Centric Escape Hotspots and Distal Epistatic Couplings Across SARS-CoV-2 Structural Proteins

Abstract

Immune escape mutations in SARS-CoV-2 are not randomly distributed, yet current methods for prioritizing functionally consequential residues remain heavily biased by prior experimental data or literature-curated escape maps. To overcome this limitation, we introduce a fully de novo, data-driven framework that identifies evolutionarily pivotal sites using only codon usage constraints across 9.4 million high-quality SARS-CoV-2 genomes (2020-2025). We integrated six orthogonal codon bias metrics into a unified CUB6 Metric Suite (C6MS) to compute a Codon-Permissiveness Score (CPS) for every residue in the receptor-binding domain (RBD). By combining CPS with observed mutational frequency, we mapped high-permissiveness, high-mutation residues onto the hACE2 interface (PDB: 6M0J; ≤5 Å cutoff). This revealed a core set of 12 key residues, including F486, L452, and K444, that form a statistically robust intra-Spike epistatic network (χ² p < 1×10⁻¹⁵; mutual information > 0.8) and exhibit accelerated global frequency increases from 2020 to 2025. Notably, N450, a site absent from conventional experimental escape maps, displays high codon-permissiveness (Shannon entropy = 0.19) and has accumulated 13 distinct mutations, predominantly L450N (97.1%) and L450D (2.8%), indicating active, evolutionarily stable diversification. In contrast, residues such as G447 and V483 now show low entropy due to near-fixation (N447G: 99.998%; E483V: 99.95%), yet their rapid global sweeps confirm they were critical permissive hotspots during earlier immune escape waves. All three surpassed 15% global frequency by early 2025 and continue to shape emerging variant fitness. Strikingly, while immune escape remains predominantly modular and confined to Spike, our analysis detects recurrent co-occurrence between non-RBD Spike variants and Membrane: D3G, which likely reflects shared lineage history.
In contrast, high-permissiveness RBD residues (e.g., N450, L452) show no such dependencies, underscoring their evolutionary autonomy. This insight transforms therapeutic strategy: monoclonal antibodies (mAbs) targeting autonomous, codon-permissive sites like N450 can be engineered based solely on local conformational plasticity and predicted mutational spectra, dramatically simplifying development and extending therapeutic shelf-life. By proactively accommodating evolutionary trajectories (e.g., L450N/D), even with modest affinity trade-offs, we shift mAb design from reactive to predictive, now informed not only by local Spike plasticity but also by emerging signals of genome-wide epistatic constraints. Our framework, which requires no prior experimental annotation, defines a Codon-Permissive Epistatic Backbone (CpEB) that explains variant success, enables evolution-informed surveillance, and is immediately generalizable to other pathogens, including H5N1.
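The abstract reports per-site Shannon entropy and pairwise mutual information but does not specify how they were computed. As a rough illustration only, the sketch below shows the textbook definitions of both quantities applied to alignment columns. The function names, the base-2 logarithm, and the toy columns are assumptions made for this example; they are not taken from the authors' pipeline.

```python
import math
from collections import Counter


def shannon_entropy(column):
    """Shannon entropy (bits) of the residues observed at one site.

    `column` is any sequence of residue symbols, one per genome.
    Base-2 logarithm is an assumption; the abstract does not state the base.
    """
    total = len(column)
    counts = Counter(column)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def mutual_information(col_a, col_b):
    """Mutual information (bits) between two sites across the same genomes."""
    if len(col_a) != len(col_b):
        raise ValueError("columns must cover the same genomes")
    n = len(col_a)
    p_a = Counter(col_a)
    p_b = Counter(col_b)
    p_ab = Counter(zip(col_a, col_b))
    mi = 0.0
    for (a, b), c in p_ab.items():
        joint = c / n
        mi += joint * math.log2(joint / ((p_a[a] / n) * (p_b[b] / n)))
    return mi


# Toy columns (hypothetical data, not from the study).
site_x = "LLLLRRRR"   # two residues, evenly split
site_y = "FFFFVVVV"   # perfectly coupled with site_x
site_z = "FVFVFVFV"   # independent of site_x

entropy_x = shannon_entropy(site_x)
mi_coupled = mutual_information(site_x, site_y)
mi_independent = mutual_information(site_x, site_z)
```

In this toy setting, a site dominated by a single residue has entropy near zero, and two sites whose residues always change together have mutual information equal to the entropy of either site.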
