Mutational analysis and assessment of its impact on proteins of SARS-CoV-2 genomes from India

Abstract

The ongoing global pandemic of SARS-CoV-2 implies a corresponding accumulation of mutations. Here, the mutational status of 611 genomes from India and the impact of the mutations on proteins were ascertained. After excluding gaps and ambiguous sequences, a total of 493 variable sites (152 parsimony-informative and 341 singleton) were observed. The most prevalent reference nucleotide was C (209) and the most prevalent substituted nucleotide was T (293). NSP3 had the highest incidence with 101 sites, followed by the S protein (74 sites), NSP12b (43 sites) and ORF3a (31 sites). The average number of mutations per sample was 2.56 for males and 2.88 for females, suggesting a higher contribution of mutations from females. The non-uniform geographical distribution of mutations, illustrated by Odisha (30 samples, 109 mutations) and Tamil Nadu (31 samples, 40 mutations), suggests that sequences in some regions are mutating faster than others. There were 281 mutations (198 ‘Neutral’ and 83 ‘Disease’) affecting the amino acid sequence. NSP13 had the largest number of ‘Disease’ variants (14), followed by the S protein and ORF3a with 13 each. Further, the proportion of ‘Disease’ mutations in genomes from asymptomatic people was only 11%, whereas in genomes from deceased patients it was more than threefold higher at 38%, indicating a contribution of these mutations to the pathophysiology of SARS-CoV-2.
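
The split of variable sites into parsimony-informative and singleton can be illustrated with a small sketch. It assumes the definitions commonly used by MEGA: a site is parsimony-informative when at least two different nucleotides each occur in at least two sequences, and a singleton when it is variable but at most one nucleotide recurs. The function names, the toy alignment, and the choice to drop any column containing a gap or ambiguity code are illustrative assumptions, not the authors' pipeline.

```python
from collections import Counter

VALID_BASES = set("ACGT")


def classify_site(column):
    """Label one alignment column.

    Returns 'excluded', 'conserved', 'singleton' or 'parsimony_informative'.
    """
    column = column.upper()
    # Assumption: complete deletion, so any gap or ambiguity code drops the column.
    if any(base not in VALID_BASES for base in column):
        return "excluded"
    counts = Counter(column)
    if len(counts) < 2:
        return "conserved"
    # Count how many distinct bases occur in at least two sequences.
    recurring = sum(1 for n in counts.values() if n >= 2)
    return "parsimony_informative" if recurring >= 2 else "singleton"


def summarize_alignment(sequences):
    """Tally site classes over equal-length aligned sequences."""
    if len({len(s) for s in sequences}) != 1:
        raise ValueError("sequences must be aligned to the same length")
    return Counter(classify_site("".join(col)) for col in zip(*sequences))


# Hypothetical toy alignment (not data from the study).
toy = ["ACGTAN", "ACGTAA", "ATGCAA", "ATGTCA"]
summary = summarize_alignment(toy)
print(dict(summary))
```

Under these definitions the two variable classes are disjoint, which is consistent with the reported totals (152 + 341 = 493).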

Article activity feed

  1. SciScore for 10.1101/2020.10.19.345066:

    Please note, not all rigor criteria are appropriate for all manuscripts.

    Table 1: Rigor

    Institutional Review Board Statement: not detected.
    Randomization: not detected.
    Blinding: not detected.
    Power Analysis: not detected.
    Sex as a biological variable: not detected.

    Table 2: Resources

    Software and Algorithms

    Sentence: Nucleotide Analysis: MEGA(v.10) is a multithreaded tool for molecular and evolutionary analysis.
    Resource: MEGA(v.10
    Suggested: None

    Sentence: We used these tools for annotation, identification and classification of mutated protein followed by verification and validation of the positions with the mutated nucleotide sites by the output of MEGA.
    Resource: MEGA
    Suggested: (Mega BLAST, RRID:SCR_011920)

    Sentence: The nucleotide similarity percentage was validated by NCBI BLAST (blast.ncbi.nlm.nih.gov) to investigate the sequence diversity.
    Resource: NCBI BLAST
    Suggested: (NCBI BLAST, RRID:SCR_004870)

    Sentence: It is expected that a SIFT score of < 0.05 is diseased (“affect protein function”), and that > 0.05 is neutral (“tolerated”).
    Resource: SIFT
    Suggested: (SIFT, RRID:SCR_012813)

    Sentence: This is stated that a PROVEAN score of < −2.5 is diseased (“deleterious”), and > −2.5 is neutral.
    Resource: PROVEAN
    Suggested: (PROVEAN, RRID:SCR_002182)
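
The SIFT and PROVEAN cutoffs quoted in the sentences above can be expressed as a minimal sketch. The function and constant names are hypothetical. The quoted sentences do not say how a score exactly at the cutoff is handled, nor how the two tools' calls were combined, so this sketch applies the strict "<" comparisons as written (a boundary score is labelled 'Neutral', which is an assumption) and reports each tool's label separately.

```python
# Cutoffs as quoted from the manuscript sentences detected by SciScore.
SIFT_CUTOFF = 0.05
PROVEAN_CUTOFF = -2.5


def sift_label(score):
    """'Disease' if the SIFT score is below 0.05, else 'Neutral'."""
    # Assumption: a score exactly at the cutoff is treated as 'Neutral'.
    return "Disease" if score < SIFT_CUTOFF else "Neutral"


def provean_label(score):
    """'Disease' if the PROVEAN score is below -2.5, else 'Neutral'."""
    # Assumption: a score exactly at the cutoff is treated as 'Neutral'.
    return "Disease" if score < PROVEAN_CUTOFF else "Neutral"


# Hypothetical scores for illustration only (not values from the study).
examples = [("variant_1", 0.01, -4.0), ("variant_2", 0.30, -1.0)]
for name, sift, provean in examples:
    print(name, sift_label(sift), provean_label(provean))
```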

    Results from OddPub: We did not detect open data. We also did not detect open code. Researchers are encouraged to share open data when possible (see Nature blog).


    Results from LimitationRecognizer: An explicit section about the limitations of the techniques employed in this study was not found. We encourage authors to address study limitations.

    Results from TrialIdentifier: No clinical trial numbers were referenced.


    Results from Barzooka: We did not find any issues relating to the usage of bar graphs.


    Results from JetFighter: We did not find any issues relating to colormaps.


    Results from rtransparent:
    • Thank you for including a conflict of interest statement. Authors are encouraged to include this statement when submitting to a journal.
    • No funding statement was detected.
    • No protocol registration statement was detected.

    About SciScore

    SciScore is an automated tool that is designed to assist expert reviewers by finding and presenting formulaic information scattered throughout a paper in a standard, easy to digest format. SciScore checks for the presence and correctness of RRIDs (research resource identifiers), and for rigor criteria such as sex and investigator blinding. For details on the theoretical underpinning of rigor criteria and the tools shown here, including references cited, please follow this link.