Comparative Genomics and Integrated Network Approach Unveiled Undirected Phylogeny Patterns, Co-mutational Hotspots, Functional Crosstalk and Regulatory Interactions in SARS-CoV-2
This article has been reviewed by the following groups
Listed in
- Evaluated articles (ScreenIT)
Abstract
The SARS-CoV-2 pandemic resulted in 92 million cases in the span of one year. This study focuses on understanding the population-specific variations that contribute to the high rate of infection in specific geographical regions, particularly the USA. Rigorous phylogenomic network analysis of 245 complete SARS-CoV-2 genomes inferred five central clades, named a (ancestral), b, c, d and e (subtypes e1 and e2). Clades d and e2 were found to consist exclusively of strains from the USA. The clades were distinguished by 10 co-mutational combinations in Nsp3, ORF8, Nsp13, S, Nsp12, Nsp2 and Nsp6. Our analysis revealed that only 67.46% of SNP mutations produced a change at the amino acid level. The T1103P mutation in Nsp3 was predicted to increase protein stability in 238 strains; the exceptions were 6 strains marked as ancestral type. The co-mutation (P409L and Y446C) in Nsp13 was found in 64 genomes from the USA, with 100% co-occurrence. Docking indicated that the D614G mutation reduced the binding of the Spike protein to ACE2 but improved its interaction with TMPRSS2, contributing to high transmissibility among USA strains. We also found that the host proteins MYO5A, MYO5B and MYO5C had the highest number of interactions with viral proteins (N, S, M). Thus, blocking the internalization pathway by inhibiting MYO5 proteins could be an effective strategy for COVID-19 treatment. The functional annotations of the host-pathogen interaction (HPI) network were closely associated with hypoxia and thrombotic conditions, confirming the vulnerability and severity of infection. We also screened CpG islands in Nsp1 and N, which confer on SARS-CoV-2 the ability to enter the host cell and trigger ZAP activity.
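The co-occurrence figure quoted for the Nsp13 co-mutation can be illustrated with a small calculation. The following is a minimal sketch, not the authors' pipeline: it assumes co-occurrence is defined as the share of genomes carrying either mutation that carry both (the abstract does not state the definition), and the function name `co_occurrence`, the strain names and the mutation labels are hypothetical toy values.

```python
from typing import Dict, Set


def co_occurrence(genomes: Dict[str, Set[str]], mut_a: str, mut_b: str) -> float:
    """Percentage of genomes carrying mut_a or mut_b that carry both.

    Assumed definition for illustration only; returns 0.0 when neither
    mutation is present in any genome.
    """
    with_either = [muts for muts in genomes.values() if mut_a in muts or mut_b in muts]
    if not with_either:
        return 0.0
    with_both = sum(1 for muts in with_either if mut_a in muts and mut_b in muts)
    return 100.0 * with_both / len(with_either)


# Hypothetical toy data: each strain maps to the set of mutations it carries.
example = {
    "strain_1": {"Nsp13:P409L", "Nsp13:Y446C", "S:D614G"},
    "strain_2": {"Nsp13:P409L", "Nsp13:Y446C"},
    "strain_3": {"S:D614G"},
    "strain_4": {"Nsp3:T1103P"},
}

print(co_occurrence(example, "Nsp13:P409L", "Nsp13:Y446C"))  # prints 100.0
```

In this toy set the two Nsp13 mutations always appear together, so the value is 100%; a pair that never shares a genome would give 0%.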
Importance
In the current study we present a global view of the mutational patterns observed in SARS-CoV-2 transmission. This provides a who-infects-whom geographical model from the early pandemic onward. This is, to date, the most comprehensive comparative genomics analysis of full-length genomes for co-mutations across different geographical regions, especially in USA strains. Computational structural biology results suggest that the mutations exert a balance of opposing effects on pathogenicity, indicating that only a few mutations, not all, are effective at the translational level. The novel HPI analysis and CpG predictions provide proof of concept for the hypoxia and thrombotic conditions seen in several patients. Thus, the current study focuses on understanding the population-specific variations that contribute to the high rate of SARS-CoV-2 infection in specific geographical regions, which may eventually be vital to the most severely affected countries and regions for the rapid development of custom-made vindication strategies.
Article activity feed
SciScore for 10.1101/2020.06.20.162560:
Please note, not all rigor criteria are appropriate for all manuscripts.
Table 1: Rigor
NIH rigor criteria are not applicable to paper type.
Table 2: Resources
Software and Algorithms

Sentence: The alignments so obtained were processed for phylogeny construction using BioEdit software (18).
Resources:
- BioEdit, suggested: (BioEdit, RRID:SCR_007361)

Sentence: Data and Computer programs: The genomic analytics is performed using programs in Python and Biopython libraries (22).
Resources:
- Python, suggested: (IPython, RRID:SCR_001658)
- Biopython, suggested: (Biopython, RRID:SCR_007173)

Sentence: To find the Host Pathogen Interaction (HPI), we subjected SARS-CoV-2 proteins sequence to Host-Pathogen interaction databases such as Viruses STRING v10.5 (24) and HPIDB3.0 (25) to predict their direct interaction with human as the principal host.
Resources:
- STRING, suggested: (STRING, RRID:SCR_005223)

Sentence: In these databases, the virus–host interaction was imported from different PPI databases like MintAct (26), IntAct (26), HPIDB (25) and VirusMentha (27).
Resources:
- IntAct, suggested: (IntAct, RRID:SCR_006944)

Sentence: For high-throughput analysis, it searches multiple protein sequences at a time using BLASTp and obtain results in tabular and sequence alignment formats (28).
Resources:
- BLASTp, suggested: (BLASTP, RRID:SCR_001010)

Sentence: , plugin of Cytoscape v3.7.2, we identified the hub protein.
Resources:
- Cytoscape, suggested: (Cytoscape, RRID:SCR_003032)

Sentence: Gene ontology (GO) analysis was performed using ClueGo (31), selecting the Kyoto Encyclopedia of Genes and Genomes (KEGG) (32), Gene Ontology - biological function database, and Reactome Pathways (33) databases.
Resources:
- ClueGo, suggested: (ClueGO, RRID:SCR_005748)
- KEGG, suggested: (KEGG, RRID:SCR_012773)
- Gene Ontology - biological, suggested: None

Sentence: Computational structural analysis on wild-type and mutant SARS-CoV-2 proteins: SARS-CoV-2 proteins sequences were retrieved from the NCBI genome database and pairwise sequence alignment of wild-type and mutant proteins were carried out by the Clustal Omega tool (34).
Resources:
- Clustal Omega, suggested: (Clustal Omega, RRID:SCR_001591)

Sentence: The docking studies for wild and mutant SARS-CoV-2 proteins with host proteins was carried out using PatchDock Server (40)
Resources:
- PatchDock, suggested: (PatchDock, RRID:SCR_017589)

Sentence: The presence of common CpG islands was confirmed by performing BLAST using the above reference strain.
Resources:
- BLAST, suggested: (BLASTX, RRID:SCR_001653)

Results from OddPub: Thank you for sharing your data.
Results from LimitationRecognizer: An explicit section about the limitations of the techniques employed in this study was not found. We encourage authors to address study limitations.
Results from TrialIdentifier: No clinical trial numbers were referenced.
Results from Barzooka: We did not find any issues relating to the usage of bar graphs.
Results from JetFighter: We did not find any issues relating to colormaps.
Results from rtransparent:
- Thank you for including a conflict of interest statement. Authors are encouraged to include this statement when submitting to a journal.
- No funding statement was detected.
- No protocol registration statement was detected.