SARS-CoV-2 has observably higher propensity to accept uracil as nucleotide substitution: Prevalence of amino acid substitutions and their predicted functional implications in circulating SARS-CoV-2 in India up to July, 2020

Abstract

SARS-CoV-2 has spread worldwide as a pandemic since late 2019. In this study, we investigated the diversity of the virus in the context of SARS-CoV-2 spread in India. Full-length SARS-CoV-2 genome sequences of viruses circulating across India were collected from GISAID, an open data repository, up to 25 July 2020. We focused on the non-synonymous changes across the genome that resulted in amino acid substitutions. Analysis of the genomic signatures of the non-synonymous mutations demonstrated a strong association between the time of sample collection and the accumulation of genetic diversity. Most of the isolates from India belonged to the A2a clade (63.4%), which has overcome selective pressure and is spreading rapidly across several continents. Interestingly, a new clade, I/A3i, has emerged as the second most prevalent type among the Indian isolates, comprising 25.5% of the Indian sequences. New mutations were observed to emerge in the S protein. The major SARS-CoV-2 clades in India have defining mutations in the RdRp. The largest accumulation of mutations was observed in ORF1a.

In addition to the clade-defining mutations, a few representative non-synonymous mutations were checked against the available crystal structures of the SARS-CoV-2 proteins in the DynaMut server to assess their effect on thermodynamic stability. We observed that SARS-CoV-2 genomes contain more uracil than any other nucleotide. Furthermore, among the non-synonymous mutations observed, substitutions to uracil were the most frequent. The A+U content of the SARS-CoV-2 genome is much higher than that of other RNA viruses, suggesting that the viral RdRp has a propensity towards incorporating uracil into the genome. This implies that thymidine analogues may have a better chance of competitively inhibiting SARS-CoV-2 RNA replication than other nucleotide analogues.
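The two quantities discussed above, base composition (including A+U content) and the target base of each substitution, can be illustrated with a minimal Python sketch. This is not the authors' pipeline (they report using MEGA X and BioEdit). The function names and the toy sequences are hypothetical, the sketch assumes the isolate is already aligned to the reference without gaps, and it counts all substitutions rather than only non-synonymous ones.

```python
from collections import Counter

BASES = "ACGU"

def to_rna(seq):
    """Upper-case a sequence and write T as U (genomes are often stored as DNA letters)."""
    return seq.upper().replace("T", "U")

def base_composition(seq):
    """Return the fraction of A, C, G and U, ignoring ambiguous characters such as N."""
    counts = Counter(b for b in to_rna(seq) if b in BASES)
    total = sum(counts.values())
    return {b: counts[b] / total for b in BASES}

def substitution_targets(reference, isolate):
    """Count substitutions by the base they change TO, for two pre-aligned sequences."""
    ref, iso = to_rna(reference), to_rna(isolate)
    if len(ref) != len(iso):
        raise ValueError("sequences must be aligned to the same length")
    targets = Counter()
    for r, i in zip(ref, iso):
        # Skip gaps and ambiguity codes; only count unambiguous base changes.
        if r != i and r in BASES and i in BASES:
            targets[i] += 1
    return targets

# Toy sequences for illustration only (not real SARS-CoV-2 data).
reference = "AUGGCUACGUUA"
isolate = "AUGGUUACGUUU"

comp = base_composition(reference)
au_content = comp["A"] + comp["U"]
print(comp, au_content)
print(substitution_targets(reference, isolate))
```

On real data, the same tally would be run per isolate against a chosen reference genome and restricted to codon changes that alter the amino acid.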

Article activity feed

  1. SciScore for 10.1101/2020.10.07.329771:

    Please note, not all rigor criteria are appropriate for all manuscripts.

    Table 1: Rigor

    NIH rigor criteria are not applicable to paper type.

    Table 2: Resources

    Software and Algorithms

    Sentence: Sequences were aligned using the MEGA X software (Kumar et al., 2018).
    Resource: MEGA X
    Suggested: None

    Sentence: Analysis of data was done using Bioedit (Hall, Biosciences and Carlsbad, 2011).
    Resource: Bioedit
    Suggested: BioEdit, RRID:SCR_007361

    Results from OddPub: We did not detect open data. We also did not detect open code. Researchers are encouraged to share open data when possible (see Nature blog).


    Results from LimitationRecognizer: An explicit section about the limitations of the techniques employed in this study was not found. We encourage authors to address study limitations.

    Results from TrialIdentifier: No clinical trial numbers were referenced.


    Results from Barzooka: We found bar graphs of continuous data. We recommend replacing bar graphs with more informative graphics, as many different datasets can lead to the same bar graph. The actual data may suggest different conclusions from the summary statistics. For more information, please see Weissgerber et al (2015).


    Results from JetFighter: We did not find any issues relating to colormaps.


    Results from rtransparent:
    • Thank you for including a conflict of interest statement. Authors are encouraged to include this statement when submitting to a journal.
    • Thank you for including a funding statement. Authors are encouraged to include this statement when submitting to a journal.
    • No protocol registration statement was detected.

    About SciScore

    SciScore is an automated tool that is designed to assist expert reviewers by finding and presenting formulaic information scattered throughout a paper in a standard, easy to digest format. SciScore checks for the presence and correctness of RRIDs (research resource identifiers), and for rigor criteria such as sex and investigator blinding. For details on the theoretical underpinning of rigor criteria and the tools shown here, including references cited, please follow this link.