A missense variant effect prediction and annotation resource for SARS-CoV-2

Abstract

The COVID-19 pandemic is a global crisis severely impacting many people across the world. An important part of the response is monitoring viral variants and determining the impact they have on viral properties, such as infectivity, disease severity and interactions with drugs and vaccines. In this work we generate and make available computational variant effect predictions for all possible single amino-acid substitutions to SARS-CoV-2 in order to complement and facilitate experiments and expert analysis. The resulting dataset contains predictions from evolutionary conservation and from structural models of proteins and complexes, combined with viral phosphosites, experimental results and variant frequencies. We demonstrate the effectiveness of the predictions by comparing them with expectations from variant frequency and prior experiments. We then identify higher-frequency variants with significant predicted effects, and find variants measured to impact antibody binding that are least likely to impact other viral functions. A web portal is available at sars.mutfunc.com, where the dataset can be searched and downloaded.
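
To make the scope of "all possible single amino-acid substitutions" concrete, the following is a minimal sketch of how such a set could be enumerated for one protein sequence. It is not the authors' pipeline: the function name `single_substitutions` and the five-residue toy sequence are assumptions made for illustration only.

```python
# Hypothetical illustration: enumerate every missense substitution of a sequence.
# The 20 standard amino acids, one-letter codes.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def single_substitutions(sequence):
    """Yield (position, wild_type, mutant) for each single amino-acid change.

    Positions are 1-based. Each position contributes 19 substitutions,
    since the wild-type residue itself is skipped.
    """
    for position, wild_type in enumerate(sequence, start=1):
        for mutant in AMINO_ACIDS:
            if mutant != wild_type:
                yield position, wild_type, mutant


# Toy peptide chosen arbitrarily for the example (an assumption, not real input data).
toy = "MFVFL"
variants = [f"{wt}{pos}{mut}" for pos, wt, mut in single_substitutions(toy)]
print(len(variants))  # 5 positions x 19 alternatives = 95
```

A protein of length n therefore yields 19n candidate variants, which is why a precomputed, searchable dataset is practical for a proteome of this size.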

Article activity feed

  1. SciScore for 10.1101/2021.02.24.432721:

    Please note, not all rigor criteria are appropriate for all manuscripts.

    Table 1: Rigor

    NIH rigor criteria are not applicable to paper type.

    Table 2: Resources

    Software and Algorithms
    Sentences | Resources
    A custom reference database was generated based on the NCBI virus coronavirus genomes dataset (NCBI Resource Coordinators, 2018), which includes sequences from a large range of coronaviruses.
    NCBI Resource Coordinators
    suggested: None
    Models were examined in turn and any position not covered by a higher priority model was added to the FoldX analysis pipeline.
    FoldX
    suggested: (FoldX, RRID:SCR_008522)
    It was filtered to exclude problematic sites using VCFTools, based on the annotation at https://github.com/W-L/ProblematicSites_SARS-CoV2/blob/master/problematic_sites_sarsCov2.vcf.
    VCFTools
    suggested: (VCFtools, RRID:SCR_001235)
    The SARS-CoV-2 genome was sourced from Ensembl (Yates et al., 2020) and Tabix indexed.
    Ensembl
    suggested: (Ensembl, RRID:SCR_002344)

    Results from OddPub: Thank you for sharing your code.


    Results from LimitationRecognizer: An explicit section about the limitations of the techniques employed in this study was not found. We encourage authors to address study limitations.

    Results from TrialIdentifier: No clinical trial numbers were referenced.


    Results from Barzooka: We did not find any issues relating to the usage of bar graphs.


    Results from JetFighter: We did not find any issues relating to colormaps.


    Results from rtransparent:
    • Thank you for including a conflict of interest statement. Authors are encouraged to include this statement when submitting to a journal.
    • No funding statement was detected.
    • No protocol registration statement was detected.

    About SciScore

    SciScore is an automated tool designed to assist expert reviewers by finding and presenting formulaic information scattered throughout a paper in a standard, easy-to-digest format. SciScore checks for the presence and correctness of RRIDs (research resource identifiers), and for rigor criteria such as sex and investigator blinding. For details on the theoretical underpinning of rigor criteria and the tools shown here, including references cited, please follow this link.