Sequence-based prediction of vaccine targets for inducing T cell responses to SARS-CoV-2 utilizing the bioinformatics predictor RECON

Abstract

Background

The ongoing COVID-19 pandemic has created an urgent need to identify novel vaccine targets for protective immunity against SARS-CoV-2. Consistent with observations for SARS-CoV, a closely related coronavirus responsible for the 2003 SARS outbreak, early reports identify a protective role for both humoral and cell-mediated immunity against SARS-CoV-2.

Methods

In this study, we leveraged HLA-I and HLA-II T cell epitope prediction tools from RECON® (Real-time Epitope Computation for ONcology), our bioinformatic pipeline that was developed using proteomic profiling of individual HLA-I and HLA-II alleles to learn the rules of peptide binding for a diverse set of such alleles. We applied these binding predictors to viral genomes from the Coronaviridae family, specifically to identify SARS-CoV-2 T cell epitopes.

Results

To test the suitability of these tools for identifying viral T cell epitopes, we first validated HLA-I and HLA-II predictions on Coronaviridae family epitopes deposited in the Virus Pathogen Database and Analysis Resource (ViPR). We then used our HLA-I and HLA-II predictors to identify 11,776 HLA-I and 7,991 HLA-II candidate binding peptides across all 12 open reading frames (ORFs) of SARS-CoV-2. This extensive list of candidate peptides is driven by the length of the ORFs and the large number of HLA-I and HLA-II alleles that we are able to predict (74 and 83, respectively), which provides over 99% coverage of the US, European, and Asian populations for both HLA-I and HLA-II. From our SARS-CoV-2 predicted peptide-HLA-I allele pairs, 368 pairs identically matched previously reported pairs in the ViPR database, originating from other coronaviruses. 320 of these pairs (89.1%) had a positive MHC-binding assay result. This analysis reinforces the validity of our predictions.

Conclusions

Using this bioinformatic platform, we identify multiple putative epitopes for CD4+ and CD8+ T cells whose HLA binding properties cover nearly the entire population and thus may be effective when included in prophylactic vaccines against SARS-CoV-2 to induce broad cellular immunity.
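
The validation step described in the Results (matching predicted peptide-HLA pairs against previously reported pairs and counting how many have a positive binding assay) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the RECON or ViPR workflow: the function name `validate_pairs`, the `(peptide, allele)` tuple layout, and the toy records are all hypothetical.

```python
from typing import Dict, Set, Tuple

# A pair is (peptide sequence, HLA allele name). This layout is an assumption
# made for illustration; it is not the ViPR export format.
Pair = Tuple[str, str]


def validate_pairs(predicted: Set[Pair],
                   reported: Dict[Pair, bool]) -> Tuple[int, int, float]:
    """Return (matched, positive, fraction_positive).

    `reported` maps each previously reported pair to True when its
    MHC-binding assay result was positive.
    """
    # Keep only predicted pairs that match a reported pair exactly.
    matched = {pair for pair in predicted if pair in reported}
    # Count the matched pairs whose assay result was positive.
    positive = sum(1 for pair in matched if reported[pair])
    fraction = positive / len(matched) if matched else 0.0
    return len(matched), positive, fraction


# Toy data (hypothetical peptides, not taken from the study).
predicted_pairs = {
    ("PEPTIDEAA", "HLA-A*02:01"),
    ("PEPTIDEBB", "HLA-B*07:02"),
    ("PEPTIDECC", "HLA-A*01:01"),
}
reported_pairs = {
    ("PEPTIDEAA", "HLA-A*02:01"): True,
    ("PEPTIDEBB", "HLA-B*07:02"): False,
    ("PEPTIDEDD", "HLA-A*24:02"): True,
}

matched, positive, fraction = validate_pairs(predicted_pairs, reported_pairs)
print(f"{positive} of {matched} matched pairs positive ({fraction:.1%})")
```

Exact matching on both peptide and allele is the strictest criterion; a looser comparison (for example, allele supertype matching) would need a different key.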

Article activity feed

  1. SciScore for 10.1101/2020.04.06.027805:

    Please note, not all rigor criteria are appropriate for all manuscripts.

    Table 1: Rigor

    NIH rigor criteria are not applicable to paper type.

    Table 2: Resources

    Software and Algorithms
    Sentence: Comparison of predicted epitopes to the human proteome: 8-12mer sequences (corresponding to predicted HLA-I epitopes), 9mer sequences (corresponding to predicted HLA-II binding cores), and 25mer sequences (corresponding to predicted HLA-II sequences that bound multiple alleles) from SARS-CoV-2 were compared against sub-sequences of the same length from the human proteome, using UCSC Genome Browser genes with hg19 annotation of the human genome and its protein coding transcripts (63,691 entries) (34).
    Resource: UCSC Genome Browser
    Suggested: (UCSC Genome Browser, RRID:SCR_005780)
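
The same-length sub-sequence comparison quoted above could be approximated with an exact-match k-mer lookup, as in the minimal sketch below. It assumes exact string identity is the matching criterion, which the quoted sentence does not state explicitly. The helper names (`kmers`, `build_human_index`, `find_self_matches`) and the toy sequences are hypothetical and do not come from the manuscript.

```python
from typing import Dict, Iterable, List, Set


def kmers(seq: str, k: int) -> Set[str]:
    """All distinct length-k sub-sequences of seq (empty if seq is shorter than k)."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def build_human_index(proteins: Iterable[str],
                      lengths: Iterable[int]) -> Dict[int, Set[str]]:
    """Index every k-mer of every protein, grouped by k, for fast lookup."""
    lengths = list(lengths)
    index: Dict[int, Set[str]] = {k: set() for k in lengths}
    for protein in proteins:
        for k in lengths:
            index[k] |= kmers(protein, k)
    return index


def find_self_matches(peptides: Iterable[str],
                      index: Dict[int, Set[str]]) -> List[str]:
    """Return the peptides that occur verbatim among human k-mers of the same length."""
    return [p for p in peptides if p in index.get(len(p), set())]


# Toy stand-ins for the human proteome and viral peptides (not real study data).
human_proteins = ["MKTAYIAKQRQISFVKSHFSRQ", "GSHMLEDPVAGAL"]
viral_peptides = ["YIAKQRQIS", "FLLNKEMYL", "LEDPVAGA"]

# Lengths named in the quoted sentence: 8-12mers, 9mer cores, and 25mers.
index = build_human_index(human_proteins, lengths=[8, 9, 10, 11, 12, 25])
print(find_self_matches(viral_peptides, index))
```

Holding every k-mer of a full proteome in memory is workable for a handful of lengths, but a real run over 63,691 transcripts might prefer a streaming or hashed index.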

    Results from OddPub: Thank you for sharing your data.


    Results from LimitationRecognizer: An explicit section about the limitations of the techniques employed in this study was not found. We encourage authors to address study limitations.

    Results from TrialIdentifier: No clinical trial numbers were referenced.


    Results from Barzooka: We did not find any issues relating to the usage of bar graphs.


    Results from JetFighter: We did not find any issues relating to colormaps.


    Results from rtransparent:
    • Thank you for including a conflict of interest statement. Authors are encouraged to include this statement when submitting to a journal.
    • Thank you for including a funding statement. Authors are encouraged to include this statement when submitting to a journal.
    • No protocol registration statement was detected.

    About SciScore

    SciScore is an automated tool designed to assist expert reviewers by finding and presenting formulaic information scattered throughout a paper in a standard, easy-to-digest format. SciScore checks for the presence and correctness of RRIDs (research resource identifiers), and for rigor criteria such as sex and investigator blinding. For details on the theoretical underpinning of rigor criteria and the tools shown here, including references cited, please follow this link.