Peptidome Surveillance Across Evolving SARS-CoV-2 Lineages Reveals HLA Binding Conservation in Nucleocapsid Among Variants With Most Potential for T-Cell Epitope Loss in Spike



Abstract

To provide a unique global view of the relative potential for evasion of CD8+ and CD4+ T cells by SARS-CoV-2 lineages as they evolve over time, we performed a comprehensive analysis of predicted HLA-I and HLA-II binding peptides in Spike (S) and Nucleocapsid (N) protein sequences of all available SARS-CoV-2 genomes as provided by NIH NCBI at bimonthly intervals between March and December of 2021. A data supplement of all B.1.1.529 (Omicron) genomes from GISAID in early December was also used to capture the rapidly spreading variant. A key finding is that throughout continued viral evolution and increasing rates of mutations occurring at T-cell epitope hotspots, protein instances with worst-case binding loss did not become the most frequent for any Variant of Concern (VOC) or Variant of Interest (VOI) lineage, suggesting that T-cell evasion is not likely to be a dominant evolutionary pressure on SARS-CoV-2. We also determined that throughout the course of the pandemic in 2021, there remained a relatively steady ratio of viral variants that exhibit conservation of epitopes in the N protein, despite significant potential for epitope loss in S relative to other lineages. We further localized conserved regions in N with high epitope yield potential, and illustrated heterogeneity in HLA-I binding across the S protein consistent with empirical observations. Although Omicron’s large number of mutations caused it to exhibit more epitope loss potential than the most frequently observed versions of proteins in almost all other VOCs, epitope candidates across its most frequent N proteins were still largely conserved. This analysis adds to the body of evidence suggesting that N may have merit as an additional antigen to elicit immune responses to vaccination, with increased potential to provide sustained protection against COVID-19 disease in the face of emerging variants.
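
As a rough illustration of the kind of peptide-level comparison the abstract describes, the sketch below enumerates candidate peptides from a reference protein and reports what fraction survive unchanged in a variant. It is a simplification under stated assumptions, not the authors' pipeline: it measures exact sequence conservation only and performs no HLA binding prediction, the 8 to 11 residue window is a commonly used HLA-I length range assumed here, and the function names `peptides` and `conserved_fraction` are hypothetical.

```python
from typing import Iterable, Set


def peptides(protein: str, lengths: Iterable[int] = range(8, 12)) -> Set[str]:
    """Return the set of all contiguous peptides of the given lengths.

    The default window of 8-11 residues is an assumption (a typical
    HLA-I peptide length range), not a value taken from the article.
    """
    return {
        protein[i:i + k]
        for k in lengths
        for i in range(len(protein) - k + 1)
    }


def conserved_fraction(reference: str, variant: str,
                       lengths: Iterable[int] = range(8, 12)) -> float:
    """Fraction of reference peptides found unchanged in the variant.

    This is exact-match conservation only; a real analysis would score
    each peptide with an HLA binding predictor before comparing.
    """
    lengths = list(lengths)  # allow the iterable to be reused twice
    ref = peptides(reference, lengths)
    if not ref:
        return 0.0
    var = peptides(variant, lengths)
    return len(ref & var) / len(ref)


if __name__ == "__main__":
    # Toy sequences for illustration only; not real S or N fragments.
    ref_seq = "ACDEFGHIKLMNPQ"
    var_seq = "GCDEFGHIKLMNPQ"  # single substitution at the first residue
    print(conserved_fraction(ref_seq, var_seq, lengths=[9]))
```

A single substitution removes every peptide window that overlaps it, which is why mutations at epitope hotspots can have an outsized effect on the candidate pool.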

Article activity feed

  1. SciScore for 10.1101/2022.03.18.484954:

    Please note, not all rigor criteria are appropriate for all manuscripts.

    Table 1: Rigor

    NIH rigor criteria are not applicable to paper type.

    Table 2: Resources

    Software and Algorithms
    Sentences | Resources
    Each downloaded genome was aligned to reference using MAFFT (52). | MAFFT; suggested: (MAFFT, RRID:SCR_011811)

    Results from OddPub: We did not detect open data. We also did not detect open code. Researchers are encouraged to share open data when possible (see Nature blog).


    Results from LimitationRecognizer: We detected the following sentences addressing limitations in the study:
    A key limitation of this study is that our analysis is built upon predictions of peptide binding to HLA-I and HLA-II molecules, which is known to be necessary, but not sufficient, for T-cell recognition. Not all predicted binders are likely to be T-cell epitopes, and thus not all changes in HLA binding prediction are guaranteed to lead to epitope loss. We aimed to minimize this disparity with our approach to integrating predictions across peptides, lengths, and HLAs, and by focusing on relative change at hotspots of potential epitopes. But ultimately there are unmodeled factors between HLA presentation and T-cell response, and an additional layer of predictive tools trained directly for the task may be necessary to improve precision in the future. A secondary limitation to the analyses presented herein is that our conclusions are based around a representative set of HLAs selected to cover the most general picture of the world population (and be inclusive of any minority but distinctly functioning HLA groups). Thus, conclusions may vary if the analysis were to be repeated with a different HLA set, for example one specific to a population or region. We also encountered very sparse coverage in population statistics of HLA-II alpha-beta haplotypes. To overcome this, we selected our HLA-II set based on all individual alpha and beta chain frequencies and considered all permutations, and so may have incurred a risk of overrepresenting the impact of some alleles. Finally, it is important ...

    Results from TrialIdentifier: No clinical trial numbers were referenced.


    Results from Barzooka: We did not find any issues relating to the usage of bar graphs.


    Results from JetFighter: We did not find any issues relating to colormaps.


    Results from rtransparent:
    • Thank you for including a conflict of interest statement. Authors are encouraged to include this statement when submitting to a journal.
    • No funding statement was detected.
    • No protocol registration statement was detected.

    Results from scite Reference Check: We found no unreliable references.


    About SciScore

    SciScore is an automated tool designed to assist expert reviewers by finding and presenting formulaic information scattered throughout a paper in a standard, easy-to-digest format. SciScore checks for the presence and correctness of RRIDs (research resource identifiers), and for rigor criteria such as sex and investigator blinding. For details on the theoretical underpinning of rigor criteria and the tools shown here, including references cited, please follow this link.
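
The limitations quoted in the LimitationRecognizer results above mention building the HLA-II set from individual alpha and beta chain frequencies and considering all permutations. A minimal sketch of that idea follows, under assumptions: the function name `pair_chains`, the `min_freq` cutoff, and the frequency values are all hypothetical placeholders, and the authors' actual selection criteria are not given in this section.

```python
from itertools import product
from typing import Dict, List


def pair_chains(alpha_freq: Dict[str, float],
                beta_freq: Dict[str, float],
                min_freq: float = 0.0) -> List[str]:
    """Enumerate every alpha-beta pairing within one HLA-II locus.

    Chains are kept when their individual frequency is at least
    min_freq (a hypothetical cutoff), then all combinations are formed.
    Because pairings are not weighted by observed haplotype frequency,
    rare or non-occurring combinations are included, which is the
    overrepresentation risk the limitations text describes.
    """
    alphas = [a for a, f in alpha_freq.items() if f >= min_freq]
    betas = [b for b, f in beta_freq.items() if f >= min_freq]
    return [f"{a}-{b}" for a, b in product(alphas, betas)]


if __name__ == "__main__":
    # Placeholder frequencies for illustration only; not population data.
    dqa = {"DQA1*01:01": 0.30, "DQA1*05:01": 0.10}
    dqb = {"DQB1*02:01": 0.20, "DQB1*03:01": 0.40, "DQB1*05:01": 0.05}
    for pairing in pair_chains(dqa, dqb, min_freq=0.10):
        print(pairing)
```

The number of pairings grows as the product of the two chain counts, so even a modest per-chain cutoff can yield a set much larger than the haplotypes actually observed in any one population.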