Genetic surveillance of SARS-CoV-2 Mpro reveals high sequence and structural conservation prior to the introduction of protease inhibitor Paxlovid

Abstract

SARS-CoV-2 continues to represent a global health emergency as a highly transmissible, airborne virus. An important coronaviral drug target for treatment of COVID-19 is the conserved main protease (Mpro). Nirmatrelvir is a potent Mpro inhibitor and the antiviral component of Paxlovid™. The significant viral sequencing effort during the ongoing COVID-19 pandemic represented a unique opportunity to assess potential nirmatrelvir escape mutations from emerging variants of SARS-CoV-2. To establish the baseline mutational landscape of Mpro prior to the introduction of Mpro inhibitors, Mpro sequences and their cleavage junction regions were retrieved from ∼4,892,000 high-quality SARS-CoV-2 genomes in GISAID. Any mutations identified from comparison to the reference sequence (Wuhan-hu-1) were cataloged and analyzed. Mutations at sites key to nirmatrelvir binding and protease functionality (e.g., dimerization sites) were still rare. Structural comparison of Mpro also showed conservation of key nirmatrelvir contact residues across the extended Coronaviridae family (alpha-, beta-, and gamma-coronaviruses). Additionally, we showed that over time the SARS-CoV-2 Mpro enzyme remained under purifying selection and was highly conserved relative to the spike protein. Now, with the emergency use authorization (EUA) of Paxlovid and its expected widespread use across the globe, it is essential to continue large-scale genomic surveillance of SARS-CoV-2 Mpro evolution. This study establishes a robust analysis framework for monitoring emergent mutations in millions of virus isolates, with the goal of identifying potential resistance to present and future SARS-CoV-2 antivirals.
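
The cataloging step described above (comparing each isolate to the reference and tallying the differences) can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names `call_substitutions` and `catalog_frequencies` are hypothetical, the sequences are short made-up peptides rather than real Mpro sequences, and the sketch assumes the protein sequences have already been aligned to the reference at equal length.

```python
from collections import Counter


def call_substitutions(reference: str, query: str) -> list[str]:
    """List substitutions (e.g. 'A4V') in a query that is pre-aligned to the reference.

    Positions are 1-based and counted along the ungapped reference.
    Deletions ('-') and ambiguous residues ('X') in the query are skipped,
    as are insertions relative to the reference (a simplification).
    """
    if len(reference) != len(query):
        raise ValueError("sequences must be aligned to the same length")
    substitutions = []
    position = 0
    for ref_aa, query_aa in zip(reference, query):
        if ref_aa == "-":
            continue  # insertion relative to the reference: not counted here
        position += 1
        if query_aa in "-X":
            continue  # deletion or ambiguous call: not a substitution
        if query_aa != ref_aa:
            substitutions.append(f"{ref_aa}{position}{query_aa}")
    return substitutions


def catalog_frequencies(reference: str, isolates: list[str]) -> dict[str, float]:
    """Return the fraction of isolates carrying each substitution."""
    counts = Counter()
    for sequence in isolates:
        counts.update(call_substitutions(reference, sequence))
    return {mutation: n / len(isolates) for mutation, n in counts.items()}


# Toy data (hypothetical peptides, NOT real Mpro sequences).
toy_reference = "MKTAYIAK"
toy_isolates = ["MKTAYIAK", "MKTVYIAK", "MKTVYIAR", "MKXAYIAK"]
toy_frequencies = catalog_frequencies(toy_reference, toy_isolates)
```

In a real surveillance setting the resulting frequency table would be filtered for positions of interest, such as inhibitor contact residues or dimerization sites; that list of positions is not reproduced here.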

Importance

The recent authorization of oral SARS-CoV-2 antivirals, such as Paxlovid, has ushered in a new era of the COVID-19 pandemic. Emergence of new variants, as well as selective pressure imposed by antiviral drugs themselves, raises concern for potential escape mutations in key drug binding motifs. To determine the potential emergence of antiviral resistance in globally circulating isolates and its implications for the clinical response to the COVID-19 pandemic, sequencing of SARS-CoV-2 viral isolates before, during, and after the introduction of new antiviral treatments is critical. The infrastructure built herein for active genetic surveillance of Mpro evolution and emergent mutations will play an important role in assessing potential antiviral resistance as the pandemic progresses and Mpro inhibitors are introduced. We anticipate our framework to be the starting point in a larger effort for global monitoring of the SARS-CoV-2 Mpro mutational landscape.

Article activity feed

  1. SciScore for 10.1101/2022.03.29.486331:

    Please note, not all rigor criteria are appropriate for all manuscripts.

    Table 1: Rigor

    NIH rigor criteria are not applicable to paper type.

    Table 2: Resources

    Software and Algorithms

    | Sentences | Resources |
    | --- | --- |
    | Mpro nucleotide sequences were obtained using BLASTN alignment (26) to the reference SARS-CoV-2 genome (NC_045512.2, isolate Wuhan-hu-1) ( | BLASTN; suggested: (BLASTN, RRID:SCR_001598) |
    | Each subset of genomes was then aligned to the reference genome (Wuhan-hu-1) using MAFFT (30) (with the --6merpair flag for rapid alignment of large numbers of closely related viral genomes). | MAFFT; suggested: (MAFFT, RRID:SCR_011811) |
    | Overall nucleotide diversity was inferred using MEGA X (31). | MEGA; suggested: (Mega BLAST, RRID:SCR_011920) |
    | Runs were compared for convergence and the resulting dN/dS determined using RStudio (version 1.1.383). | RStudio; suggested: (RStudio, RRID:SCR_000432) |

    Results from OddPub: We did not detect open data. We also did not detect open code. Researchers are encouraged to share open data when possible (see Nature blog).


    Results from LimitationRecognizer: We detected the following sentences addressing limitations in the study:
    Another caveat of using GISAID datasets is that only consensus genome sequences are available. Potential emerging resistant mutations usually have low frequency (minor allele) within viral quasi-species and will not be uncovered from assembled genomic contigs. The presence of artifacts in assembled sequencing data is also expected due to inevitable errors in the sequencing process. While GISAID has implemented internal checks to flag potential errors in submitted assemblies, this does not eliminate the potential risk of misinterpreting artifacts as mutations. Nonetheless, the vast number of sequences available for analysis (>7 million SARS-CoV-2 genomes as of January 14, 2022) proved valuable in providing a comprehensive picture of the mutational landscape of Mpro. At present, SARS-CoV-2 continues to represent a global health threat as new variants emerge. It is essential to continue tracking Mpro mutations in global viral isolates, especially since nirmatrelvir, the active protease inhibitor in Paxlovid, is expected to become a widely accessible COVID-19 treatment option. However, at present, nirmatrelvir has yet to be deployed on a mass scale. Following FDA approval of remdesivir, its widespread usage in hospitals for the first year and a half of the COVID-19 pandemic has permitted analyses of known resistance mutations in viral isolates under remdesivir selection (57). Therefore, as more sampled viral isolates undergo nirmatrelvir selection, and as more sequences become av...

    Results from TrialIdentifier: No clinical trial numbers were referenced.


    Results from Barzooka: We did not find any issues relating to the usage of bar graphs.


    Results from JetFighter: We did not find any issues relating to colormaps.


    Results from rtransparent:
    • No conflict of interest statement was detected. If there are no conflicts, we encourage authors to explicitly state so.
    • Thank you for including a funding statement. Authors are encouraged to include this statement when submitting to a journal.
    • No protocol registration statement was detected.

    Results from scite Reference Check: We found no unreliable references.


    About SciScore

    SciScore is an automated tool that is designed to assist expert reviewers by finding and presenting formulaic information scattered throughout a paper in a standard, easy to digest format. SciScore checks for the presence and correctness of RRIDs (research resource identifiers), and for rigor criteria such as sex and investigator blinding. For details on the theoretical underpinning of rigor criteria and the tools shown here, including references cited, please follow this link.