Phylogenetic clustering of the Indian SARS-CoV-2 genomes reveals the presence of distinct clades of viral haplotypes among states
This article has been reviewed by the following groups
Listed in
- Evaluated articles (ScreenIT)
Abstract
The first Indian cases of COVID-19 caused by SARS-CoV-2 were reported on February 29, 2020, with a history of travel from Wuhan, China, and so far more than 4500 deaths have been attributed to this pandemic. The objectives of this study were to characterize genome-wide nucleotide variations in Indian SARS-CoV-2 genomes, trace ancestries using phylogenetic networks, and correlate the state-wise distribution of viral haplotypes with differences in mortality rates. A total of 305 whole-genome sequences from 19 Indian states were downloaded from GISAID. Sequences were aligned using the ancestral Wuhan-Hu genome sequence (NC_045512.2) as the reference. A total of 633 variants resulting in 388 amino acid substitutions were identified. The allele frequency spectrum and nucleotide diversity (π) values revealed a higher proportion of low-frequency variants, and negative Tajima’s D values across ORFs indicated population expansion. Network analysis highlighted two major clusters of viral haplotypes: clade G, carrying the S:D614G and RdRp:P323L variants, and a variant of clade L [Lv] carrying the RdRp:A97V variant. Clade G genomes were found to be evolving more rapidly into multiple sub-clusters, including clades GH and GR, and were also found in higher proportions in the three states with the highest mortality rates, namely Gujarat, Madhya Pradesh and West Bengal.
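The diversity statistics named in the abstract can be illustrated with a short sketch. This is not the authors' pipeline (the report below quotes them as using MEGA X); it is a minimal, assumed implementation of the standard formulas for segregating sites (S), nucleotide diversity (π), Watterson's estimator (θw) and Tajima's D on a toy alignment. The function name `diversity_stats` and the toy sequences are hypothetical, and gaps or ambiguous bases are simply treated as extra alleles, which a real analysis would handle more carefully.

```python
from collections import Counter
from math import sqrt


def diversity_stats(seqs):
    """Return (S, pi, theta_w, tajimas_d) for equal-length aligned sequences.

    pi and theta_w are totals over the alignment; divide by the alignment
    length to obtain per-site values.
    """
    n = len(seqs)
    if n < 4:
        # The variance term of Tajima's D is zero for n < 4.
        raise ValueError("Tajima's D needs at least 4 sequences")
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned to the same length")

    n_pairs = n * (n - 1) / 2
    seg_sites = 0
    pair_diffs = 0.0
    for column in zip(*seqs):
        counts = Counter(column)
        if len(counts) > 1:
            seg_sites += 1
            # Number of sequence pairs that differ at this site.
            pair_diffs += (n * n - sum(c * c for c in counts.values())) / 2
    pi = pair_diffs / n_pairs

    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i**2 for i in range(1, n))
    theta_w = seg_sites / a1
    if seg_sites == 0:
        return seg_sites, pi, theta_w, float("nan")  # D is undefined

    # Constants from Tajima (1989).
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)
    variance = e1 * seg_sites + e2 * seg_sites * (seg_sites - 1)
    return seg_sites, pi, theta_w, (pi - theta_w) / sqrt(variance)


# Toy alignment (hypothetical): every variant is a singleton, mimicking
# an excess of low-frequency variants.
toy = ["TAAA", "ATAA", "AATA", "AAAT", "AAAA"]
print(diversity_stats(toy))
```

In the toy alignment every variant occurs in a single sequence, so π falls below θw and D comes out negative, the same direction of signal the abstract interprets as population expansion.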
Article activity feed
SciScore for 10.1101/2020.05.28.122143:
Please note, not all rigor criteria are appropriate for all manuscripts.
Table 1: Rigor
- Institutional Review Board Statement: not detected.
- Randomization: not detected.
- Blinding: not detected.
- Power Analysis: not detected.
- Sex as a biological variable: not detected.
- Cell Line Authentication: not detected.
Table 2: Resources
Experimental Models: Cell Lines
- Sentence: Out of a total of 305 sequences, 26 were found without state information and 7 had been grown in Vero cells.
  Resource: Vero, suggested: (CLS Cat# 605372/p622_VERO, RRID:CVCL_0059)
Software and Algorithms
- Sentence: Multiple sequence alignment was executed using MUSCLE [11] with three iterations for both.
  Resource: MUSCLE, suggested: (MUSCLE, RRID:SCR_011812)
- Sentence: The SIFT database was used to identify amino acid changes that could [affect] protein function (http://blocks.fhcrc.org/sift/SIFT_seq_submit2.html) [12].
  Resource: SIFT, suggested: (SIFT, RRID:SCR_012813)
- Sentence: 2.2 Measurements of diversity and deviation from neutrality: Watterson’s estimator (θw), nucleotide diversity (π) and Tajima’s D [13] for each open reading frame (ORF) was calculated using MEGA X [14].
  Resource: MEGA, suggested: (Mega BLAST, RRID:SCR_011920)
Results from OddPub: We did not detect open data. We also did not detect open code. Researchers are encouraged to share open data when possible (see Nature blog).
Results from LimitationRecognizer: An explicit section about the limitations of the techniques employed in this study was not found. We encourage authors to address study limitations.
Results from TrialIdentifier: No clinical trial numbers were referenced.
Results from Barzooka: We did not find any issues relating to the usage of bar graphs.
Results from JetFighter: We did not find any issues relating to colormaps.
Results from rtransparent:
- Thank you for including a conflict of interest statement. Authors are encouraged to include this statement when submitting to a journal.
- Thank you for including a funding statement. Authors are encouraged to include this statement when submitting to a journal.
- No protocol registration statement was detected.