Ubiquitous Forbidden Order in R-group classified protein sequence of SARS-CoV-2 and other viruses
This article has been reviewed by the following groups
Listed in
- Evaluated articles (ScreenIT)
Abstract
Each amino acid in a polypeptide chain has a distinctive R-group (side chain). We report here a novel method of species characterization based on the order of R-group classes along the linear protein sequence, with each codon triplet assigned the R-group class of the amino acid it encodes. In this otherwise pseudo-random sequence, we search for forbidden combinations of k-th order. We applied this method to the available protein sequences of various viruses, including SARS-CoV-2. We found that these ubiquitous forbidden orders (UFO) are unique to each of the viruses we analyzed. This unique structure may provide insight into the viruses’ chemical behavior and the folding patterns of their proteins. The finding may have broad significance for the analysis of coding sequences of species in general.
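The search described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' MATLAB code, which is not available here. Two things are assumptions: the five-class R-group grouping (a common textbook scheme, since the abstract does not say which classification was used) and the reading of a "forbidden combination of k-th order" as a word of k consecutive R-group labels that never occurs in the sequence. The names `R_GROUPS`, `classify` and `forbidden_words` are hypothetical.

```python
from itertools import product

# ASSUMPTION: a common five-class textbook grouping of the 20 standard
# amino acids by side chain; the source does not specify its scheme.
R_GROUPS = {
    "n": "GAPVLIM",  # nonpolar, aliphatic
    "r": "FYW",      # aromatic
    "p": "STCNQ",    # polar, uncharged
    "+": "KRH",      # positively charged
    "-": "DE",       # negatively charged
}

# Reverse lookup: one-letter amino acid code -> R-group label.
AA_TO_CLASS = {aa: label for label, members in R_GROUPS.items() for aa in members}


def classify(protein):
    """Map a one-letter protein sequence to a string of R-group labels.

    Symbols outside the 20 standard amino acids (e.g. X or *) are skipped,
    which joins their neighbours; a stricter version would split there.
    """
    return "".join(AA_TO_CLASS[aa] for aa in protein.upper() if aa in AA_TO_CLASS)


def forbidden_words(protein, k):
    """Return, sorted, every k-letter R-group word absent from the sequence."""
    classes = classify(protein)
    # All words of length k that do occur, read with a sliding window.
    seen = {classes[i:i + k] for i in range(len(classes) - k + 1)}
    alphabet = sorted(R_GROUPS)
    candidates = ("".join(w) for w in product(alphabet, repeat=k))
    return [w for w in candidates if w not in seen]


toy = "GFSKD"  # made-up five-residue peptide, not real viral data
print(classify(toy))                 # nrp+-
print(len(forbidden_words(toy, 2)))  # 21
```

With five classes there are 5^k possible words of length k, so a short sequence leaves most of them unseen simply through lack of data. Absences only become informative when the sequence is long relative to 5^k.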
Article activity feed
-
SciScore for 10.1101/2020.08.21.261289:
Please note, not all rigor criteria are appropriate for all manuscripts.
Table 1: Rigor
NIH rigor criteria are not applicable to paper type.
Table 2: Resources
Software and Algorithms
| Sentences | Resources |
| --- | --- |
| Coding sequences of species were downloaded from the National Centre for Bioinformatics (NCBI) website (https://www.ncbi.nlm.nih.gov/). | https://www.ncbi.nlm.nih.gov/ suggested: (GENSAT at NCBI - Gene Expression Nervous System Atlas, RRID:SCR_003923) |
| A MATLAB code reads the sequence and classifies each amino acid triplet as its respective R-group side chain. | MATLAB suggested: (MATLAB, RRID:SCR_001622) |
Results from OddPub: We did not detect open data. We also did not detect open code. Researchers are encouraged to share open data when possible (see Nature blog).
Results from LimitationRecognizer: An explicit section about the limitations of the techniques employed in this study was not found. We encourage authors to address study limitations.
Results from TrialIdentifier: No clinical trial numbers were referenced.
Results from Barzooka: We did not find any issues relating to the usage of bar graphs.
Results from JetFighter: We did not find any issues relating to colormaps.
Results from rtransparent:
- Thank you for including a conflict of interest statement. Authors are encouraged to include this statement when submitting to a journal.
- No funding statement was detected.
- No protocol registration statement was detected.