Comparative Analysis of Human Coronaviruses Focusing on Nucleotide Variability and Synonymous Codon Usage Pattern
This article has been Reviewed by the following groups
Listed in
- Evaluated articles (ScreenIT)
Abstract
The ongoing worldwide pandemic caused by SARSCoV-2 has drawn great attention to discovering its evolutionary origin. We perform an exploratory study to understand the variability of the whole coding region of possible proximal evolutionary neighbours of SARSCoV-2. We consider seven human coronavirus strains from six different species as candidates for our study.
First, we observe considerable variability of nucleotides across the candidate strains. We did not find significant variation of GC content across the strains at the first and second codon positions. However, we see large variability of GC content at the third codon position (GC3), and the mean GC content is quite close within the pairs (SARSCoV, MERSCoV) and (SARSCoV-2, hCoV229E). Examining the relative abundance of dinucleotides, we find a shared genetic pattern, i.e., high usage of the GC and CT nucleotide pairs at the first two positions (P12) and the last two positions (P23) of codons, respectively. We also observe a low abundance of the CG pair, which might help in their evolutionary process. Second, considering the RSCU score, we find substantial similarity among the mild-class coronaviruses, i.e., hCoVOC43, hCoVHKU1, and hCoVNL63, based on their codons with a high RSCU value (≥ 1.5); the minimum number of such codons (9) is observed for MERSCoV. Seven codons, ATT, ACT, TCT, CCT, GTT, GCT and GGT, have a high RSCU value in all seven strains. These codons mostly encode amino acids from the aliphatic and hydroxyl groups. A phylogenetic tree built using the RSCU feature reveals proximity between hCoVOC43 and hCoV229E (mild). Third, we perform linear regression analysis among the GC content at different codon positions and the ENC value. We observe a strong correlation (significant p-value) between GC2 and GC3 for SARSCoV-2, hCoV229E and hCoVNL63, and between GC1 and GC3 for hCoV229E, hCoVNL63 and SARSCoV. We believe that our findings will help in understanding the mechanism of human coronaviruses.
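The two central measures in the abstract, GC content per codon position (GC1, GC2, GC3) and the RSCU score, can be illustrated with a short sketch. This is not the authors' code (none was shared). It assumes the standard genetic code and the usual definition of RSCU, i.e., the observed count of a codon divided by the count expected if all synonymous codons for that amino acid were used equally. The function names and the toy sequence are hypothetical and chosen only for illustration.

```python
from collections import Counter
from itertools import product

# Standard genetic code (NCBI translation table 1), codons ordered T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def split_codons(seq):
    """Split an in-frame coding sequence into codons, skipping ambiguous triplets."""
    seq = seq.upper().replace("U", "T")
    triplets = (seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3))
    return [c for c in triplets if c in CODON_TABLE]


def gc_by_position(seq):
    """Return (GC1, GC2, GC3): the fraction of G or C at each codon position."""
    codons = split_codons(seq)
    if not codons:
        raise ValueError("no complete codons found")
    return tuple(sum(c[pos] in "GC" for c in codons) / len(codons) for pos in range(3))


def rscu(seq):
    """Return {codon: RSCU} for amino acids with at least two synonymous codons."""
    counts = Counter(split_codons(seq))
    families = {}
    for codon, aa in CODON_TABLE.items():
        if aa != "*":  # stop codons are excluded
            families.setdefault(aa, []).append(codon)
    scores = {}
    for family in families.values():
        total = sum(counts[c] for c in family)
        if len(family) < 2 or total == 0:
            continue  # Met/Trp, or amino acid absent from the sequence
        expected = total / len(family)  # count under equal synonymous usage
        for c in family:
            scores[c] = counts[c] / expected
    return scores


# Toy sequence (made up, not from any coronavirus genome).
toy = "GCTGCTGCTGCCGGTGGA"
gc1, gc2, gc3 = gc_by_position(toy)
high = sorted(c for c, v in rscu(toy).items() if v >= 1.5)  # threshold from the abstract
print(gc1, gc2, gc3)
print(high)
```

On real data, the input would be the concatenated coding sequences of one strain, and the resulting RSCU vectors (one per strain) could then be compared across strains, for example as features for clustering.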
Article activity feed
-
SciScore for 10.1101/2020.07.28.224386: (What is this?)
Please note, not all rigor criteria are appropriate for all manuscripts.
Table 1: Rigor
Institutional Review Board Statement: not detected.
Randomization: not detected.
Blinding: not detected.
Power Analysis: not detected.
Sex as a biological variable: not detected.
Table 2: Resources
Software and Algorithms
Sentence: Nucleotide sequences for each strain of length ≈ 28kb are collected from NCBI database during April, 2020.
Resource: NCBI, suggested: (NCBI, RRID:SCR_006472)
Results from OddPub: We did not detect open data. We also did not detect open code. Researchers are encouraged to share open data when possible (see Nature blog).
Results from LimitationRecognizer: An explicit section about the limitations of the techniques employed in this study was not found. We encourage authors to address study limitations.
Results from TrialIdentifier: No clinical trial numbers were referenced.
Results from Barzooka: We did not find any issues relating to the usage of bar graphs.
Results from JetFighter: We did not find any issues relating to colormaps.
Results from rtransparent:
- Thank you for including a conflict of interest statement. Authors are encouraged to include this statement when submitting to a journal.
- No funding statement was detected.
- No protocol registration statement was detected.
-
SciScore for 10.1101/2020.07.28.224386: (What is this?)
Please note, not all rigor criteria are appropriate for all manuscripts.
Table 1: Rigor
NIH rigor criteria are not applicable to paper type.
Table 2: Resources
No key resources detected.
Results from OddPub: We did not detect open data. We also did not detect open code. Researchers are encouraged to share open data when possible (see Nature blog).
Results from LimitationRecognizer: An explicit section about the limitations of the techniques employed in this study was not found. We encourage authors to address study limitations.
Results from Barzooka: We did not find any issues relating to the usage of bar graphs.
Results from JetFighter: We did not find any issues relating to colormaps.
About SciScore
SciScore is an automated tool that is designed to assist expert reviewers by finding and presenting formulaic information scattered throughout a paper in a standard, easy to digest format. SciScore is not a substitute for expert review. SciScore checks for the presence and correctness of RRIDs (research resource identifiers) in the manuscript, and detects sentences that appear to be missing RRIDs. SciScore also checks to make sure that rigor criteria are addressed by authors. It does this by detecting sentences that discuss criteria such as blinding or power analysis. SciScore does not guarantee that the rigor criteria that it detects are appropriate for the particular study. Instead it assists authors, editors, and reviewers by drawing attention to sections of the manuscript that contain or should contain various rigor criteria and key resources. For details on the results shown here, including references cited, please follow this link.
-
SciScore for 10.1101/2020.07.28.224386: (What is this?)
Please note, not all rigor criteria are appropriate for all manuscripts.
Table 1: Rigor
NIH rigor criteria are not applicable to paper type.
Table 2: Resources
Software and Algorithms
Sentence: Nucleotide sequences for each strain of length ≈ 28kb are collected from NCBI database during April, 2020.
Resource: NCBI, suggested: (NCBI, SCR_006472)
Sentence: We observe that the content of A is high for SARSCoV2, and is low for MERSCoV and hCoVNL63; the content of T is high for hCoVHKU1 and hCoVNL63, and is low for SARSCoV; the content of C is high for SARSCoV and MERSCoV, and is low hCoVHKU1; the content of G is high for hCoVOC43 and hCoV229E, and is low for hCoVHKU1.
Resource: SARSCoV, suggested: None
Data from additional tools added to each annotation on a weekly basis.
