An integrated in silico immuno-genetic analytical platform provides insights into COVID-19 serological and vaccine targets

Abstract

During the COVID-19 pandemic, diagnostic serological tools and vaccines have been developed. To inform control activities in a post-vaccine surveillance setting, we have developed an online “immuno-analytics” resource that combines epitope, sequence, protein and SARS-CoV-2 mutation analysis. SARS-CoV-2 spike and nucleocapsid proteins are both vaccine and serological diagnostic targets. Using the tool, the nucleocapsid protein appears to be a sub-optimal target for use in serological platforms. Spike D614G (and nsp12 L314P) mutations were most frequent (> 86%), whilst spike A222V and L18F have recently increased in frequency. Also, Orf3a proteins may be a suitable target for serology. The tool can be accessed from: http://genomics.lshtm.ac.uk/immuno (online); https://github.com/dan-ward-bio/COVID-immunoanalytics (source code).
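The mutation frequencies quoted in the abstract (for example, spike D614G at > 86%) are the kind of figure that can be derived by comparing aligned protein sequences against a reference. The Python sketch below illustrates the general idea only. It is not the authors' in-house script, the function names `call_substitutions` and `substitution_frequencies` are hypothetical, and the five-residue sequences are toy data rather than SARS-CoV-2 proteins.

```python
from collections import Counter


def call_substitutions(reference: str, sample: str) -> list[str]:
    """Name substitutions (e.g. 'D614G') between two aligned protein sequences.

    Both strings must come from the same alignment, so they have equal
    length and use '-' for gaps. Positions are numbered on the ungapped
    reference, starting at 1.
    """
    if len(reference) != len(sample):
        raise ValueError("sequences must be aligned to the same length")
    calls = []
    ref_pos = 0  # residue number in the ungapped reference
    for ref_aa, alt_aa in zip(reference, sample):
        if ref_aa != "-":
            ref_pos += 1
        # Ignore gaps, ambiguous residues (X) and matching residues.
        if "-" in (ref_aa, alt_aa) or "X" in (ref_aa, alt_aa) or ref_aa == alt_aa:
            continue
        calls.append(f"{ref_aa}{ref_pos}{alt_aa}")
    return calls


def substitution_frequencies(reference: str, samples: list[str]) -> dict[str, float]:
    """Fraction of samples carrying each substitution."""
    counts = Counter()
    for sample in samples:
        counts.update(set(call_substitutions(reference, sample)))
    return {name: count / len(samples) for name, count in counts.items()}


# Toy data (assumption): one reference and four aligned samples.
toy_reference = "MKDLA"
toy_samples = ["MKGLA", "MKGLA", "MKDLA", "MRGLA"]
toy_frequencies = substitution_frequencies(toy_reference, toy_samples)
print(toy_frequencies)
```

In this toy data set, three of the four samples carry the substitution at position 3, so `D3G` has a frequency of 0.75. A real analysis would also need to handle insertions, deletions and low-quality sequences, which this sketch simply skips.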

Article activity feed

  1. SciScore for 10.1101/2020.05.11.089409:

    Please note, not all rigor criteria are appropriate for all manuscripts.

    Table 1: Rigor

    NIH rigor criteria are not applicable to paper type.

    Table 2: Resources

    Software and Algorithms
    Sentences | Resources
    Whole genome sequence data analysis: SARS-CoV-2 nucleotide sequences were downloaded from NCBI (https://www.ncbi.nlm.nih.gov) and GISAID (https://www.gisaid.org).
    https://www.ncbi.nlm.nih.gov
    suggested: (GENSAT at NCBI - Gene Expression Nervous System Atlas, RRID:SCR_003923)
    As a part of an automated in-house pipeline, sequences were aligned using MAFFT software (v7.2) [12] and trimmed to the beginning of the first reading frame (orf1ab-nsp1).
    MAFFT
    suggested: (MAFFT, RRID:SCR_011811)
    Using data available from the NCBI COVID-19 resource, a modified annotation (GFF) file was generated and open reading frames (ORFs) for each respective viral protein were extracted (taking into account ribosomal slippage) using bedtools ‘getfasta’ function [13].
    bedtools
    suggested: (BEDTools, RRID:SCR_006646)
    Each ORF was translated using EMBOSS transeq software [14] and the variants for each protein sequence were identified using an in-house script.
    EMBOSS
    suggested: (EMBOSS, RRID:SCR_008493)
    Using BLASTp [26] we mapped short amino acid epitope sequences onto the canonical sequence of SARS-CoV-2 proteins.
    BLASTp
    suggested: (BLASTP, RRID:SCR_001010)
    Coronavirus homology analysis: Reference proteomes for SARS, MERS, OC43, 229E, HKU1 and NL63 α and β coronavirus (-CoV) species were sourced from UniProt database.
    UniProt
    suggested: (UniProtKB, RRID:SCR_004426)
    Homologous peptide sequences with a BLAST bitscore indicating 10 or more residues mapped to the target sequence were recorded and parsed for display on the graph.
    BLAST
    suggested: (BLASTX, RRID:SCR_001653)
    The BioCircos.js library [10] was used to generate the interactive plot and Datatables.net libraries for the table.
    BioCircos
    suggested: None
    For the temporal/geographic non-synonymous mutation plots, we partitioned the whole genome sequencing dataset by week and continent and plotted non-synonymous allele frequencies using the Google Charts JavaScript libraries.
    Google
    suggested: (Google, RRID:SCR_017097)
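
The last sentence quoted in the table above describes partitioning the sequence data by week and continent before plotting non-synonymous allele frequencies. A minimal Python sketch of that grouping step follows. It assumes each sequence has already been reduced to a collection date, a continent and a set of called mutations. The record layout, the function name `weekly_frequencies` and the example records are all hypothetical, and using ISO weeks is an assumption, since the quoted text does not say how weeks were defined.

```python
from collections import defaultdict
from datetime import date


def weekly_frequencies(records, mutation):
    """Frequency of `mutation` per (ISO year, ISO week, continent).

    `records` is an iterable of (collection_date, continent, mutations),
    where `mutations` is a set of names such as {"D614G"}.
    """
    totals = defaultdict(int)    # sequences seen in each partition
    carriers = defaultdict(int)  # sequences carrying the mutation
    for collected, continent, mutations in records:
        iso = collected.isocalendar()
        key = (iso[0], iso[1], continent)  # ISO year, ISO week, continent
        totals[key] += 1
        if mutation in mutations:
            carriers[key] += 1
    return {key: carriers[key] / totals[key] for key in sorted(totals)}


# Invented example records (assumption), not data from the study.
example_records = [
    (date(2020, 3, 2), "Europe", {"D614G"}),
    (date(2020, 3, 4), "Europe", set()),
    (date(2020, 3, 9), "Europe", {"D614G"}),
    (date(2020, 3, 3), "Asia", set()),
]
example_frequencies = weekly_frequencies(example_records, "D614G")
for partition, frequency in example_frequencies.items():
    print(partition, frequency)
```

The resulting dictionary maps each week and continent to a single frequency, which is the shape a charting library typically needs for a time-series plot. Partitions with very few sequences give noisy estimates, so a minimum sample size per partition may be worth adding.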

    Results from OddPub: Thank you for sharing your code and data.


    Results from LimitationRecognizer: An explicit section about the limitations of the techniques employed in this study was not found. We encourage authors to address study limitations.

    Results from TrialIdentifier: No clinical trial numbers were referenced.


    Results from Barzooka: We did not find any issues relating to the usage of bar graphs.


    Results from JetFighter: We did not find any issues relating to colormaps.


    Results from rtransparent:
    • Thank you for including a conflict of interest statement. Authors are encouraged to include this statement when submitting to a journal.
    • Thank you for including a funding statement. Authors are encouraged to include this statement when submitting to a journal.
    • No protocol registration statement was detected.

    About SciScore

    SciScore is an automated tool that is designed to assist expert reviewers by finding and presenting formulaic information scattered throughout a paper in a standard, easy-to-digest format. SciScore checks for the presence and correctness of RRIDs (research resource identifiers), and for rigor criteria such as sex and investigator blinding. For details on the theoretical underpinning of rigor criteria and the tools shown here, including references cited, please follow this link.