Mutational analysis and assessment of its impact on proteins of SARS-CoV-2 genomes from India

Abstract

The ongoing global pandemic of SARS-CoV-2 implies a corresponding accumulation of mutations. Here, the mutational status of 611 genomes from India and the impact of the mutations on proteins were ascertained. After excluding gaps and ambiguous sequences, a total of 493 variable sites (152 parsimony-informative and 341 singleton) were observed. The most prevalent reference nucleotide was C (209) and the most prevalent substituted nucleotide was T (293). NSP3 had the highest number of variable sites (101), followed by S protein (74 sites), NSP12b (43 sites) and ORF3a (31 sites). The average number of mutations per sample was 2.56 for males and 2.88 for females, suggesting a higher contribution of mutations from females. The non-uniform geographical distribution of mutations, illustrated by Odisha (30 samples, 109 mutations) and Tamil Nadu (31 samples, 40 mutations), suggests that sequences in some regions are mutating faster than others. There were 281 mutations (198 ‘Neutral’ and 83 ‘Disease’) affecting the amino acid sequence. NSP13 had the most ‘Disease’ variants (14), followed by S protein and ORF3a with 13 each. Further, the proportion of ‘Disease’ mutations in genomes from asymptomatic people was a mere 11%, whereas in genomes from deceased patients it was more than threefold higher at 38%, indicating a contribution of these mutations to the pathophysiology of SARS-CoV-2.
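The arithmetic behind the regional comparison and the "threefold" statement can be checked with a short sketch. It uses only the counts quoted in the abstract. The variable names are illustrative and not taken from the manuscript, and the per-sample rate is simply mutations divided by samples, which may differ from how the authors normalised their data.

```python
# Counts quoted in the abstract: (samples, mutations) per region.
regions = {"Odisha": (30, 109), "Tamil Nadu": (31, 40)}

# Mutations per sample, assuming a plain ratio of the two quoted counts.
rates = {name: muts / samples for name, (samples, muts) in regions.items()}
for name, rate in rates.items():
    print(f"{name}: {rate:.2f} mutations per sample")

# Fold difference in the 'Disease' share: deceased (38%) vs asymptomatic (11%).
fold = 38 / 11
print(f"Fold difference: {fold:.2f}")

# Consistency checks on the totals reported in the abstract.
assert 152 + 341 == 493   # parsimony-informative + singleton = variable sites
assert 198 + 83 == 281    # 'Neutral' + 'Disease' = amino-acid-changing mutations
```

With the quoted figures this gives roughly 3.63 mutations per sample for Odisha against 1.29 for Tamil Nadu, and a fold difference of about 3.45.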

SciScore for 10.1101/2020.10.19.345066:

Please note, not all rigor criteria are appropriate for all manuscripts.

Table 1: Rigor

Institutional Review Board Statement	not detected.
Randomization	not detected.
Blinding	not detected.
Power Analysis	not detected.
Sex as a biological variable	not detected.

Table 2: Resources

Software and Algorithms
Sentences	Resources
Nucleotide Analysis: MEGA(v.10) is a multithreaded tool for molecular and evolutionary analysis.	MEGA(v.10 suggested: None
We used these tools for annotation, identification and classification of mutated protein followed by verification and validation of the positions with the mutated nucleotide sites by the output of MEGA.	MEGA suggested: (Mega BLAST, RRID:SCR_011920)
The nucleotide similarity percentage was validated by NCBI BLAST (blast.ncbi.nlm.nih.gov) to investigate the sequence diversity.	NCBI BLAST suggested: (NCBI BLAST, RRID:SCR_004870)
A SIFT score of < 0.05 is classified as diseased (“affect protein function”), and a score of > 0.05 as neutral (“tolerated”).	SIFT suggested: (SIFT, RRID:SCR_012813)
A PROVEAN score of < −2.5 is classified as diseased (“deleterious”), and a score of > −2.5 as neutral.	PROVEAN suggested: (PROVEAN, RRID:SCR_002182)
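The two cutoffs quoted in the table can be written as a minimal sketch. The function names and example scores are hypothetical. Only the thresholds (0.05 for SIFT, −2.5 for PROVEAN) come from the quoted sentences. The source does not say how a score exactly at the cutoff is treated, nor how the two predictions were combined, so this sketch assumes a boundary score is 'Neutral' and reports each tool separately.

```python
SIFT_CUTOFF = 0.05      # quoted: score < 0.05 "affect protein function"
PROVEAN_CUTOFF = -2.5   # quoted: score < -2.5 "deleterious"


def classify_sift(score: float) -> str:
    """Label a SIFT score; a score equal to the cutoff is assumed 'Neutral'."""
    return "Disease" if score < SIFT_CUTOFF else "Neutral"


def classify_provean(score: float) -> str:
    """Label a PROVEAN score; a score equal to the cutoff is assumed 'Neutral'."""
    return "Disease" if score < PROVEAN_CUTOFF else "Neutral"


# Hypothetical scores for illustration only; not taken from the manuscript.
examples = [("variant_A", 0.01, -3.1), ("variant_B", 0.40, -0.8)]
for name, sift, provean in examples:
    print(name, classify_sift(sift), classify_provean(provean))
```

Keeping the two classifiers separate avoids guessing at a consensus rule that the quoted text does not describe.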

Results from OddPub: We did not detect open data. We also did not detect open code. Researchers are encouraged to share open data when possible (see Nature blog).

Results from LimitationRecognizer: An explicit section about the limitations of the techniques employed in this study was not found. We encourage authors to address study limitations.

Results from TrialIdentifier: No clinical trial numbers were referenced.

Results from Barzooka: We did not find any issues relating to the usage of bar graphs.

Results from JetFighter: We did not find any issues relating to colormaps.

Results from rtransparent:

Thank you for including a conflict of interest statement. Authors are encouraged to include this statement when submitting to a journal.
No funding statement was detected.
No protocol registration statement was detected.
