Immunoinformatic identification of B cell and T cell epitopes in the SARS-CoV-2 proteome


Abstract

A novel coronavirus (SARS-CoV-2) emerged from China in late 2019 and rapidly spread across the globe, infecting millions of people and generating societal disruption on a level not seen since the 1918 influenza pandemic. A safe and effective vaccine is desperately needed to prevent the continued spread of SARS-CoV-2; yet, rational vaccine design efforts are currently hampered by the lack of knowledge regarding viral epitopes targeted during an immune response, and the need for more in-depth knowledge on betacoronavirus immunology. To that end, we developed a computational workflow using a series of open-source algorithms and webtools to analyze the proteome of SARS-CoV-2 and identify putative T cell and B cell epitopes. Utilizing a set of stringent selection criteria to filter peptide epitopes, we identified 41 T cell epitopes (5 HLA class I, 36 HLA class II) and 6 B cell epitopes that could serve as promising targets for peptide-based vaccine development against this emerging global pathogen. To our knowledge, this is the first study to comprehensively analyze all 10 (structural, non-structural and accessory) proteins from SARS-CoV-2 using predictive algorithms to identify potential targets for vaccine development.

Article activity feed

  1. SciScore for 10.1101/2020.05.14.093757:

    Please note, not all rigor criteria are appropriate for all manuscripts.

    Table 1: Rigor

    NIH rigor criteria are not applicable to paper type.

    Table 2: Resources

    Software and Algorithms
    Sentences | Resources
    Comparison of genome sequences from SARS-CoV-2 isolates: Genomic sequences for reported SARS-CoV-2 isolates were identified and retrieved from the Virus Pathogen Resource (ViPR) database on February 27, 2020 (https://www.viprbrc.org/brc/home.spg?decorator=corona_ncov).
    ViPR
    suggested: (vipR, RRID:SCR_010685)
    Remaining sequences were aligned using the Clustal Omega program (version 1.2.4) from the European Bioinformatics Institute (39) and compared against the first reported genome sequence for SARS-CoV-2 (Wuhan-Hu-1; taxonomy ID: 2697049) (1).
    Clustal Omega
    suggested: (Clustal Omega, RRID:SCR_001591)
    Peptides with scores above this threshold were subsequently analyzed on the NetMHCpan 4.0 server (Technical University of Denmark) to predict binding affinity and percentile rank across representative alleles of each major HLA class I supertype (HLA-A*01:01, HLA-A*02:01, HLA-A*03:01, HLA-A*24:02, HLA-B*07:02, HLA-B*08:01, HLA-B*27:05, HLA-B*40:01, HLA-B*58:01, HLA-B*15:01), which collectively cover the majority of class I alleles present in the human population (42–44).
    NetMHCpan
    suggested: (NetMHCpan Server, RRID:SCR_018182)
    HLA class I and class II peptides with high predicted binding affinities (≤ 500 nM), high percentile ranks (≤ 0.5% for class I; ≤ 2% for class II), and broad HLA coverage (> 3 alleles) were independently analyzed on the VaxiJen 2.0 server (Edward Jenner Institute) (46, 47) using a conservative score threshold (0.7) to predict antigenicity.
    VaxiJen
    suggested: (VaxiJen, RRID:SCR_018514)
    The main protein structure was modeled in PyMOL (Schrödinger, LLC), with predicted B cell epitopes identified by both BepiPred 1.0 and DiscoTope 1.1 highlighted as spheres.
    PyMOL
    suggested: (PyMOL, RRID:SCR_000305)
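
The selection criteria quoted in the table above (binding affinity ≤ 500 nM, percentile rank ≤ 0.5% for class I or ≤ 2% for class II, coverage of more than 3 alleles, antigenicity threshold 0.7) can be sketched as a simple filter. This is a minimal illustration, not the authors' pipeline: the `Candidate` record, the function names, the placeholder peptide, and all scores are hypothetical. It also assumes that an allele counts toward coverage only when both the affinity and the rank cutoff are met, and that a score equal to the threshold passes; the quoted text does not settle either point.

```python
from dataclasses import dataclass
from math import inf

# Cutoffs quoted in the Resources table above.
AFFINITY_CUTOFF_NM = 500.0               # predicted binding affinity, <= 500 nM
RANK_CUTOFF_PCT = {"I": 0.5, "II": 2.0}  # percentile rank cutoff per HLA class
MIN_ALLELES_EXCLUSIVE = 3                # "broad HLA coverage (> 3 alleles)"
ANTIGENICITY_THRESHOLD = 0.7             # antigenicity score threshold


@dataclass(frozen=True)
class Candidate:
    """Hypothetical record for one peptide; not an output format of any server."""
    peptide: str
    hla_class: str       # "I" or "II"
    affinity_nm: dict    # allele name -> predicted affinity in nM
    rank_pct: dict       # allele name -> predicted percentile rank
    antigenicity: float  # predicted antigenicity score


def binding_alleles(c: Candidate) -> list:
    """Alleles meeting both the affinity and the rank cutoff (assumed conjunction)."""
    rank_cutoff = RANK_CUTOFF_PCT[c.hla_class]
    return sorted(
        allele
        for allele, affinity in c.affinity_nm.items()
        if affinity <= AFFINITY_CUTOFF_NM
        and c.rank_pct.get(allele, inf) <= rank_cutoff
    )


def passes_filters(c: Candidate) -> bool:
    # Assumption: a score at or above the threshold passes.
    return (
        len(binding_alleles(c)) > MIN_ALLELES_EXCLUSIVE
        and c.antigenicity >= ANTIGENICITY_THRESHOLD
    )


# Illustrative values only: placeholder peptide and invented scores.
example = Candidate(
    peptide="AAAAAAAAA",
    hla_class="I",
    affinity_nm={
        "HLA-A*01:01": 40.0, "HLA-A*02:01": 120.0, "HLA-A*03:01": 480.0,
        "HLA-B*07:02": 300.0, "HLA-B*08:01": 900.0,
    },
    rank_pct={
        "HLA-A*01:01": 0.1, "HLA-A*02:01": 0.3, "HLA-A*03:01": 0.5,
        "HLA-B*07:02": 0.4, "HLA-B*08:01": 3.0,
    },
    antigenicity=0.82,
)

print(binding_alleles(example))
print(passes_filters(example))
```

Keeping the cutoffs as named constants makes it easy to rerun the filter with more liberal or more stringent values.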

    Results from OddPub: We did not detect open data. We also did not detect open code. Researchers are encouraged to share open data when possible (see Nature blog).


    Results from LimitationRecognizer: We detected the following sentences addressing limitations in the study:
    Our study possessed several strengths and limitations. Rather than restricting our analyses of HLA class I and class II epitopes to specific proteins based on prior studies of SARS-CoV immunology, we investigated the complete proteome of SARS-CoV-2 using an unbiased approach. Furthermore, we employed a multi-tiered strategy for identifying putative B cell and T cell epitopes from all viral proteins studied. Our initial analyses were performed with liberal thresholds for epitope identification, and at each additional step, we imposed more stringent selection criteria to filter these peptides to a subset of B cell and T cell epitopes for further study. Nevertheless, the results of this study are derived purely from computational methods, and it should be noted that computational algorithms can fail to capture a significant number of antigenic peptides (69). Experimental validation with biological samples will ultimately be needed. During the early stages of a pandemic, access to sufficient biological samples may be extremely limited, so we must continue to utilize methodologies—such as computational predictive algorithms— that allow us to explore the epitope landscape for experimental vaccine development. Our approach in this study allowed us to identify and refine a manageable subset of T cell and B cell epitopes for further testing as components of a SARS-CoV-2 vaccine. Based on our results, our proposed SARS-CoV-2 vaccine formulation could contain the following: 1) one or mo...

    Results from TrialIdentifier: We found the following clinical trial numbers in your paper:

    Identifier | Status | Title
    NCT04324606 | Active, not recruiting | A Study of a Candidate COVID-19 Vaccine (COV001)
    NCT04283461 | Active, not recruiting | Safety and Immunogenicity Study of 2019-nCoV Vaccine (mRNA-1…
    NCT04336410 | Active, not recruiting | Safety, Tolerability and Immunogenicity of INO-4800 for COVI…
    NCT04352608 | Active, not recruiting | Safety and Immunogenicity Study of Inactivated Vaccine for P…


    Results from Barzooka: We did not find any issues relating to the usage of bar graphs.


    Results from JetFighter: We did not find any issues relating to colormaps.


    Results from rtransparent:
    • Thank you for including a conflict of interest statement. Authors are encouraged to include this statement when submitting to a journal.
    • Thank you for including a funding statement. Authors are encouraged to include this statement when submitting to a journal.
    • No protocol registration statement was detected.

    About SciScore

    SciScore is an automated tool that is designed to assist expert reviewers by finding and presenting formulaic information scattered throughout a paper in a standard, easy to digest format. SciScore checks for the presence and correctness of RRIDs (research resource identifiers), and for rigor criteria such as sex and investigator blinding. For details on the theoretical underpinning of rigor criteria and the tools shown here, including references cited, please follow this link.
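
As a rough illustration of what a presence check for RRIDs might look like, the sketch below pulls RRID-like tokens out of text with a regular expression. It is not SciScore's implementation. The pattern and the `find_rrids` helper are assumptions, and the check only recognizes the shape of an identifier such as `RRID:SCR_010685`; it does not verify the identifier against any registry.

```python
import re

# Assumed shape: "RRID:" + registry prefix + "_" + identifier characters.
RRID_PATTERN = re.compile(r"RRID:\s?([A-Za-z]+_[A-Za-z0-9:\-]+)")


def find_rrids(text: str) -> list:
    """Return RRID-like identifiers found in text, without the 'RRID:' prefix."""
    return RRID_PATTERN.findall(text)


# Lines taken from the Resources table in this report.
lines = [
    "suggested: (vipR, RRID:SCR_010685)",
    "suggested: (Clustal Omega, RRID:SCR_001591)",
    "BepiPred 1.0 and DiscoTope 1.1 highlighted as spheres.",
]
for line in lines:
    print(find_rrids(line))
```

A line with no identifier yields an empty list, which a reviewer-facing tool could report as a missing RRID.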