Prioritization of SARS-CoV-2 epitopes using a pan-HLA and global population inference approach

Abstract

SARS-CoV-2 T cell response assessment and vaccine development may benefit from an approach that considers the global landscape of the human leukocyte antigen (HLA) proteins. We predicted the binding affinity of 9-mer and 15-mer peptides from the SARS-CoV-2 peptidome for 9,360 class I and 8,445 class II HLA alleles, respectively. We identified 368,145 unique combinations of peptide-HLA complexes (pMHCs) with a predicted binding affinity of less than 500 nM, and observed significant overlap between class I and II predicted pMHCs. Using simulated populations derived from worldwide HLA frequency data, we identified sets of epitopes predicted in at least 90% of the population in 57 countries. We also developed a method to prioritize pMHCs for specific populations. Collectively, this public dataset and accessible user interface (Shiny app: https://rstudio-connect.parkerici.org/content/13/) can be used to explore the SARS-CoV-2 epitope landscape in the context of diverse HLA types across global populations.
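
The two ideas in the abstract, a binary 500 nM binding cutoff and coverage estimated on a simulated population, can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the peptide labels, affinity values, allele frequencies, and function names below are all invented for illustration, and the sketch ignores the class I/class II and 9-mer/15-mer distinctions.

```python
import random

THRESHOLD_NM = 500.0  # cutoff stated in the abstract; all data below are made up

# Hypothetical predicted affinities in nM, keyed by (peptide, allele).
PREDICTED_NM = {
    ("PEP_1", "A*02:01"): 12.0,
    ("PEP_1", "B*07:02"): 850.0,
    ("PEP_2", "A*01:01"): 430.0,
    ("PEP_2", "B*07:02"): 95.0,
    ("PEP_3", "A*24:02"): 2200.0,
}

# Hypothetical allele frequencies per gene for one invented population.
ALLELE_FREQ = {
    "A": {"A*01:01": 0.3, "A*02:01": 0.5, "A*24:02": 0.2},
    "B": {"B*07:02": 0.4, "B*08:01": 0.6},
}


def predicted_binders(predicted_nm, threshold_nm=THRESHOLD_NM):
    """Return the (peptide, allele) pairs whose predicted affinity is below the cutoff."""
    return {pair for pair, nm in predicted_nm.items() if nm < threshold_nm}


def simulate_population(allele_freq, size, rng):
    """Draw `size` individuals, each with two alleles per gene sampled by frequency."""
    population = []
    for _ in range(size):
        alleles = set()
        for freqs in allele_freq.values():
            names = list(freqs)
            weights = [freqs[name] for name in names]
            alleles.update(rng.choices(names, weights=weights, k=2))
        population.append(frozenset(alleles))
    return population


def coverage(peptides, binders, population):
    """Fraction of individuals carrying at least one allele that binds a chosen peptide."""
    if not population:
        return 0.0
    binding_alleles = {allele for peptide, allele in binders if peptide in peptides}
    covered = sum(1 for person in population if person & binding_alleles)
    return covered / len(population)


if __name__ == "__main__":
    binders = predicted_binders(PREDICTED_NM)
    people = simulate_population(ALLELE_FREQ, size=10_000, rng=random.Random(0))
    for chosen in ({"PEP_1"}, {"PEP_1", "PEP_2"}):
        print(sorted(chosen), round(coverage(chosen, binders, people), 3))
```

Because the cutoff is applied as a yes/no filter, a 5 nM and a 400 nM prediction count equally here. A peptide set would meet the abstract's criterion for a population when `coverage` returns at least 0.9.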

Article activity feed

  1. SciScore for 10.1101/2020.03.30.016931:

    Please note, not all rigor criteria are appropriate for all manuscripts.

    Table 1: Rigor

    NIH rigor criteria are not applicable to paper type.

    Table 2: Resources

    Software and Algorithms
    Sentence: The neoCOVID Explorer application was developed using the Shiny R package [REF] and deployed using RStudio-Connect.
    Resource: Shiny
    Suggested: (Shiny, RRID:SCR_001626)

    Results from OddPub: Thank you for sharing your code.


    Results from LimitationRecognizer: We detected the following sentences addressing limitations in the study:
    This dataset and analysis have limitations. Our analysis was restricted to pMHC complexes with predicted binding affinities of less than 500 nM. Subsequent analysis did not treat the predicted binding affinities as a continuous variable (i.e., predicted values of 5 nM and 400 nM were treated similarly in the remaining analysis). In the absence of experimental validation, we did not try to over-delineate the association between HLA diversity and the predicted binding affinity. Furthermore, utilizing a threshold of 500 nM may result in underestimating the number of alleles associated with the predicted antigenic peptides. Our predictions were limited to 9-mers and 15-mers, which represent most but not all reported HLA class I and class II binding peptides. Our data also do not account for either the quantity or timing of viral protein expression in a host cell, both of which can impact the immunogenicity of predicted epitopes (Croft et al., 2019). Finally, analysis of global population frequencies was restricted to a limited number of HLA alleles and countries. While AFND is the most comprehensive database summarizing the population frequencies of HLA haplotypes, it is far from complete. Frequencies are reported for 73, 73, and 49 countries for genes HLA-A, -B, and -C, respectively. In addition, the number of alleles reported for each gene is variable across countries, ranging from 1 to 1,498. In summary, our resource provides a pan-HLA tool for those seeking to study SARS-CoV-2 or v...

    Results from TrialIdentifier: No clinical trial numbers were referenced.


    Results from Barzooka: We did not find any issues relating to the usage of bar graphs.


    Results from JetFighter: We did not find any issues relating to colormaps.


    Results from rtransparent:
    • Thank you for including a conflict of interest statement. Authors are encouraged to include this statement when submitting to a journal.
    • Thank you for including a funding statement. Authors are encouraged to include this statement when submitting to a journal.
    • No protocol registration statement was detected.

    About SciScore

    SciScore is an automated tool designed to assist expert reviewers by finding and presenting formulaic information scattered throughout a paper in a standard, easy-to-digest format. SciScore checks for the presence and correctness of RRIDs (research resource identifiers), and for rigor criteria such as sex and investigator blinding. For details on the theoretical underpinning of rigor criteria and the tools shown here, including references cited, please follow this link.