Mutational signatures in countries affected by SARS-CoV-2: Implications in host-pathogen interactome

Abstract

We are in the midst of the third severe coronavirus outbreak, caused by SARS-CoV-2, with unprecedented health and socio-economic consequences due to COVID-19. Globally, the major thrust of scientific efforts has shifted to the design of potent vaccine and anti-viral candidates. Earlier genome analyses have shown global dominance of some mutations, purportedly indicative of similar infectivity and transmissibility of SARS-CoV-2 worldwide. Using a large, high-quality dataset of 25k whole-genome sequences, we show the emergence of new clusters of mutations as a result of the geographic evolution of SARS-CoV-2 in the local populations (≥10%) of different nations. Using statistical analysis, we observe that these mutations either co-occur significantly in globally dominant strains or, in other cases, are mutually exclusive. These mutations potentially modulate the structural stability of proteins, some of which form part of the SARS-CoV-2-human interactome. The high-confidence druggable host proteins are also up-regulated during SARS-CoV-2 infection. Mutations occurring in potential hot-spot regions within likely T-cell and B-cell epitopes, or in proteins that are part of the host-viral interactome, could hamper vaccine or drug efficacy in local populations. Overall, our study provides a comprehensive view of emerging geo-clonal mutations, which would aid researchers in understanding and developing effective countermeasures in the current crisis.
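The abstract does not name the statistical test used to decide whether two mutations co-occur or are mutually exclusive. As an illustration only, the sketch below assumes a two-sided Fisher's exact test on a 2x2 table of genome counts; the function names, the `alpha` cutoff, and the mutation labels `M1` and `M2` are hypothetical and not taken from the study.

```python
from math import comb


def fisher_exact_2x2(a, b, c, d):
    """Two-sided Fisher's exact test for a 2x2 table.

    a: genomes carrying both mutations
    b: genomes carrying only the first mutation
    c: genomes carrying only the second mutation
    d: genomes carrying neither
    Returns (odds_ratio, p_value).
    """
    row1, row2, col1, n = a + b, c + d, a + c, a + b + c + d

    def prob(x):
        # Hypergeometric probability of seeing x genomes with both mutations.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    lo, hi = max(0, col1 - row2), min(row1, col1)
    p_obs = prob(a)
    # Sum the probabilities of all tables at least as extreme as the observed one.
    p_value = sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-9))
    if b * c:
        odds = (a * d) / (b * c)
    else:
        odds = float("inf") if a * d else float("nan")
    return odds, min(p_value, 1.0)


def classify_pair(genomes, mut_a, mut_b, alpha=0.05):
    """genomes: list of sets of mutation labels (one set per genome)."""
    a = sum(1 for g in genomes if mut_a in g and mut_b in g)
    b = sum(1 for g in genomes if mut_a in g and mut_b not in g)
    c = sum(1 for g in genomes if mut_a not in g and mut_b in g)
    d = len(genomes) - a - b - c
    odds, p_value = fisher_exact_2x2(a, b, c, d)
    if p_value >= alpha:
        label = "independent"
    elif odds > 1:
        label = "co-occurring"
    elif odds < 1:
        label = "mutually exclusive"
    else:
        label = "independent"
    return {"table": (a, b, c, d), "odds_ratio": odds, "p_value": p_value, "label": label}


if __name__ == "__main__":
    # Toy data: two hypothetical mutations that always appear together.
    toy = [{"M1", "M2"}] * 10 + [set()] * 10
    print(classify_pair(toy, "M1", "M2"))
```

With many mutation pairs tested at once, a multiple-testing correction would normally be applied to the p-values; that step is omitted here for brevity.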

Significance

Our comparative analysis of globally dominant and region-specific mutations in 25k SARS-CoV-2 genomes elucidates the geo-clonal evolution of the virus. We observe locally dominant mutations (co-occurring or mutually exclusive) in nations with contrasting COVID-19 mortalities (per million of population), besides the globally dominant ones, namely P314L (ORF1b) and D614G (S). We also see exclusive dominant mutations, such as in Brazil (I33T in ORF6 and I292T in N protein), England (G251V in ORF3a), India (T2016K and L3606F in ORF1a) and Spain (L84S in ORF8). The emergence of these local mutations in ORFs within the SARS-CoV-2 genome could have interventional implications and also points towards their potential to modulate the infectivity of SARS-CoV-2 in regional populations.
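The distinction between locally dominant and exclusive dominant mutations can be sketched as a per-country frequency count. The example below is a minimal illustration that reads the ≥10% figure from the abstract as a per-country frequency cutoff, which is an assumption; the record layout, function names, and the country and mutation labels in the toy data are hypothetical.

```python
from collections import Counter, defaultdict


def dominant_mutations(records, threshold=0.10):
    """Return, per country, the mutations found in at least `threshold` of its genomes.

    records: iterable of (country, mutations) pairs, one pair per genome.
    """
    totals = Counter()
    counts = defaultdict(Counter)
    for country, mutations in records:
        totals[country] += 1
        counts[country].update(set(mutations))  # count each mutation once per genome
    dominant = {}
    for country, total in totals.items():
        dominant[country] = {
            mut: n / total for mut, n in counts[country].items() if n / total >= threshold
        }
    return dominant


def exclusive_dominant(dominant):
    """Keep only mutations that are dominant in exactly one country."""
    seen = Counter(mut for muts in dominant.values() for mut in muts)
    return {
        country: sorted(mut for mut in muts if seen[mut] == 1)
        for country, muts in dominant.items()
    }


if __name__ == "__main__":
    # Toy data: "g" is dominant everywhere, "a" and "b" only in one country each.
    toy = (
        [("X", {"a", "g"})] * 5
        + [("X", {"g"})] * 5
        + [("Y", {"b", "g"})] * 2
        + [("Y", {"g"})] * 8
    )
    dom = dominant_mutations(toy)
    print(dom)
    print(exclusive_dominant(dom))
```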

Article activity feed

  1. SciScore for 10.1101/2020.09.17.301614:

    Please note, not all rigor criteria are appropriate for all manuscripts.

    Table 1: Rigor

    Institutional Review Board Statement: not detected.
    Randomization: not detected.
    Blinding: not detected.
    Power Analysis: not detected.
    Sex as a biological variable: not detected.

    Table 2: Resources

    Software and Algorithms
    Sentence: Glimmer was used for gene prediction (ORFs), and ORF sequences with Ns were masked out and were not considered for protein translation (24).
    Resources:
      Glimmer, suggested: (Glimmer, RRID:SCR_011931)

    Sentence: MAFFT was used for ORF alignment and mutations were computed using BioInception’s in-house pipeline based on R (https://cran.r-project.org/) and Python.
    Resources:
      MAFFT, suggested: (MAFFT, RRID:SCR_011811)
      https://cran.r-project.org/, suggested: (CRAN, RRID:SCR_003005)
      Python, suggested: (IPython, RRID:SCR_001658)
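The in-house pipeline quoted in the table above is not public, so the following is only a sketch of the two steps the sentences describe: skipping ORFs that contain N, and reading substitutions off a pairwise protein alignment such as one produced by MAFFT. It assumes the alignment is already available as two gapped strings of equal length; the function names and the toy sequences are hypothetical.

```python
def contains_n(orf_nucleotides):
    """True if the ORF has any ambiguous base N and should be excluded from translation."""
    return "N" in orf_nucleotides.upper()


def call_substitutions(ref_aligned, query_aligned):
    """List amino-acid substitutions such as 'A4V' from two aligned protein sequences.

    Positions are numbered along the reference, ignoring gap columns in the reference.
    Insertions, deletions, and unknown residues (X) in the query are skipped.
    """
    if len(ref_aligned) != len(query_aligned):
        raise ValueError("aligned sequences must have the same length")
    substitutions = []
    ref_pos = 0
    for ref_aa, query_aa in zip(ref_aligned.upper(), query_aligned.upper()):
        if ref_aa == "-":
            continue  # insertion in the query relative to the reference
        ref_pos += 1
        if query_aa in "-X":
            continue  # deletion or unresolved residue in the query
        if ref_aa != query_aa:
            substitutions.append(f"{ref_aa}{ref_pos}{query_aa}")
    return substitutions


if __name__ == "__main__":
    # Toy ORFs: the second contains N and is dropped before any translation.
    orfs = {"orf_ok": "ATGAAAACCGCC", "orf_masked": "ATGNNNACCGCC"}
    usable = {name: seq for name, seq in orfs.items() if not contains_n(seq)}
    print(sorted(usable))
    # Toy alignment with one insertion column and one substitution.
    print(call_substitutions("MK-TAY", "MKLTVY"))
```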

    Results from OddPub: We did not detect open data. We also did not detect open code. Researchers are encouraged to share open data when possible (see Nature blog).


    Results from LimitationRecognizer: An explicit section about the limitations of the techniques employed in this study was not found. We encourage authors to address study limitations.

    Results from TrialIdentifier: No clinical trial numbers were referenced.


    Results from Barzooka: We did not find any issues relating to the usage of bar graphs.


    Results from JetFighter: We did not find any issues relating to colormaps.


    Results from rtransparent:
    • Thank you for including a conflict of interest statement. Authors are encouraged to include this statement when submitting to a journal.
    • No funding statement was detected.
    • No protocol registration statement was detected.

    About SciScore

    SciScore is an automated tool that is designed to assist expert reviewers by finding and presenting formulaic information scattered throughout a paper in a standard, easy to digest format. SciScore checks for the presence and correctness of RRIDs (research resource identifiers), and for rigor criteria such as sex and investigator blinding. For details on the theoretical underpinning of rigor criteria and the tools shown here, including references cited, please follow this link.