An integrated in silico immuno-genetic analytical platform provides insights into COVID-19 serological and vaccine targets
Listed in: Evaluated articles (ScreenIT)
Abstract
Background
The COVID-19 pandemic, caused by the SARS-CoV-2 virus, imposes a major global health and socio-economic burden. It has instigated the mobilisation of resources into the development of control tools, such as diagnostics and vaccines. The poor performance of some diagnostic serological tools has emphasised the need for up-to-date immune-informatic analyses to inform the selection of viable targets for further study. This requires the integration and analysis of genetic and immunological data for SARS-CoV-2, together with its homology to other human coronavirus species, to understand cross-reactivity.
Methods
We have developed an online “immuno-analytics” resource to facilitate SARS-CoV-2 research, combining an extensive B/T-cell epitope mapping and prediction meta-analysis, and human CoV sequence homology mapping and protein database annotation, with an updated variant database and geospatial tracking for >7,800 non-synonymous mutation positions derived from >150,000 whole genome sequences. To demonstrate its utility, we present an integrated analysis of SARS-CoV-2 spike and nucleocapsid proteins, both being vaccine and serological diagnostic targets, including an analysis of changes in relevant mutation frequencies over time.
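The variant database described above rests on comparing each translated protein sequence with a reference. The authors' in-house script is not shown, so the following is only a minimal sketch of one way such substitutions could be called from a pairwise-aligned reference and sample. The function name `call_substitutions`, the `SKIP_RESIDUES` set, and the toy sequences are illustrative assumptions, not part of the published pipeline.

```python
# Hypothetical sketch: call amino-acid substitutions (e.g. "D614G") from two
# aligned protein sequences. Not the authors' in-house script.

# Sample residues that are ignored: alignment gap, ambiguous residue, stop.
SKIP_RESIDUES = {"-", "X", "*"}


def call_substitutions(reference: str, sample: str) -> list[str]:
    """Return substitutions between two aligned sequences of equal length.

    Positions are 1-based and counted on the ungapped reference, so gap
    columns in the reference (insertions in the sample) do not shift them.
    """
    if len(reference) != len(sample):
        raise ValueError("sequences must be aligned to the same length")

    calls = []
    position = 0  # residue index in the ungapped reference
    for ref_aa, alt_aa in zip(reference, sample):
        if ref_aa == "-":
            continue  # insertion relative to the reference; not reported here
        position += 1
        if alt_aa in SKIP_RESIDUES or alt_aa == ref_aa:
            continue  # missing data or identical residue
        calls.append(f"{ref_aa}{position}{alt_aa}")
    return calls
```

For example, `call_substitutions("MFVD", "MFVG")` reports a single substitution at position 4. Insertions and deletions are deliberately ignored in this sketch; a production script would need to handle them explicitly.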
Results
Our analysis reveals that the nucleocapsid protein in its native form appears to be a sub-optimal target for use in serological diagnostic platforms. The most frequent mutations were the spike protein D614G and nsp12 L314P, which were common (>86%) across all the geographical regions. Some mutations in the spike protein (e.g. A222V and L18F) have increased in frequency in Europe during the latter half of 2020, detected using our automated algorithms. The tool also suggests that orf3a proteins may be a suitable alternative target for diagnostic serologic assays in a post-vaccine surveillance setting.
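Tracking how a mutation's frequency changes over time amounts to grouping sequences by week and region and computing the fraction that carry the mutation. The sketch below illustrates that calculation under stated assumptions: the tuple layout of `samples`, the function name `weekly_frequencies`, the use of ISO calendar weeks, and the example records are all hypothetical and do not reproduce the tool's automated algorithms or its data.

```python
# Hypothetical sketch: per-week, per-continent frequency of one mutation.
from collections import defaultdict
from datetime import date


def weekly_frequencies(samples, mutation):
    """Return {(continent, iso_year, iso_week): carrier fraction}.

    `samples` is an iterable of (collection_date, continent, mutations),
    where `mutations` is a set of labels such as {"D614G", "A222V"}.
    """
    totals = defaultdict(int)
    carriers = defaultdict(int)
    for collected, continent, mutations in samples:
        iso = collected.isocalendar()  # (ISO year, ISO week, ISO weekday)
        key = (continent, iso[0], iso[1])
        totals[key] += 1
        if mutation in mutations:
            carriers[key] += 1
    return {key: carriers[key] / totals[key] for key in totals}


# Made-up records for illustration only.
example = [
    (date(2020, 9, 7), "Europe", {"D614G", "A222V"}),
    (date(2020, 9, 9), "Europe", {"D614G"}),
    (date(2020, 9, 14), "Europe", {"D614G", "A222V"}),
    (date(2020, 9, 8), "Asia", {"D614G"}),
]
frequencies = weekly_frequencies(example, "A222V")
```

Weeks with very few sequences give unstable fractions, so a real analysis would likely apply a minimum sample count per week before reporting a trend.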
Conclusions
The immuno-analytics tool can be accessed online (http://genomics.lshtm.ac.uk/immuno) and will serve as a useful resource for biological discovery and surveillance in the fight against SARS-CoV-2. Further, the tool may be adapted to inform on biological targets in future outbreaks, including potential emerging human coronaviruses that spill over from animal hosts.
Article activity feed
SciScore for 10.1101/2020.05.11.089409
Please note, not all rigor criteria are appropriate for all manuscripts.
Table 1: Rigor
NIH rigor criteria are not applicable to paper type.
Table 2: Resources
Software and Algorithms (each entry gives the sentence from the manuscript, then the resource detected)
- Sentence: Whole genome sequence data analysis: SARS-CoV-2 nucleotide sequences were downloaded from NCBI (https://www.ncbi.nlm.nih.gov) and GISAID (https://www.gisaid.org).
  Resource: https://www.ncbi.nlm.nih.gov, suggested: (GENSAT at NCBI - Gene Expression Nervous System Atlas, RRID:SCR_003923)
- Sentence: As a part of an automated in-house pipeline, sequences were aligned using MAFFT software (v7.2) [12] and trimmed to the beginning of the first reading frame (orf1ab-nsp1).
  Resource: MAFFT, suggested: (MAFFT, RRID:SCR_011811)
- Sentence: Using data available from the NCBI COVID-19 resource, a modified annotation (GFF) file was generated and open reading frames (ORFs) for each respective viral protein were extracted (taking into account ribosomal slippage) using bedtools ‘getfasta’ function [13].
  Resource: bedtools, suggested: (BEDTools, RRID:SCR_006646)
- Sentence: Each ORF was translated using EMBOSS transeq software [14] and the variants for each protein sequence were identified using an in-house script.
  Resource: EMBOSS, suggested: (EMBOSS, RRID:SCR_008493)
- Sentence: Using BLASTp [26] we mapped short amino acid epitope sequences onto the canonical sequence of SARS-CoV-2 proteins.
  Resource: BLASTp, suggested: (BLASTP, RRID:SCR_001010)
- Sentence: Coronavirus homology analysis: Reference proteomes for SARS, MERS, OC43, 229E, HKU1 and NL63 α and β coronavirus (-CoV) species were sourced from UniProt database.
  Resource: UniProt, suggested: (UniProtKB, RRID:SCR_004426)
- Sentence: Homologous peptide sequences with a BLAST bitscore indicating 10 or more residues mapped to the target sequence were recorded and parsed for display on the graph.
  Resource: BLAST, suggested: (BLASTX, RRID:SCR_001653)
- Sentence: The BioCircos.js library [10] was used to generate the interactive plot and Datatables.net libraries for the table.
  Resource: BioCircos, suggested: None
- Sentence: For the temporal/geographic non-synonymous mutation plots, we partitioned the whole genome sequencing dataset by week and continent and plotted non-synonymous allele frequencies using the Google Charts JavaScript libraries.
  Resource: Google, suggested: (Google, RRID:SCR_017097)
Results from OddPub: Thank you for sharing your code and data.
Results from LimitationRecognizer: An explicit section about the limitations of the techniques employed in this study was not found. We encourage authors to address study limitations.
Results from TrialIdentifier: No clinical trial numbers were referenced.
Results from Barzooka: We did not find any issues relating to the usage of bar graphs.
Results from JetFighter: We did not find any issues relating to colormaps.
Results from rtransparent:
- Thank you for including a conflict of interest statement. Authors are encouraged to include this statement when submitting to a journal.
- Thank you for including a funding statement. Authors are encouraged to include this statement when submitting to a journal.
- No protocol registration statement was detected.
