Development of Web Application for the Comparison of Segment Variability with Sequence Evolution and Immunogenic Properties for Highly Variable Proteins: An Application to Viruses
This article has been Reviewed by the following groups
Listed in
- Evaluated articles (ScreenIT)
Abstract
High mutation rates and structural flexibility allow viral proteins to quickly become resistant to the host immune system and existing antiviral strategies. For most pathogenic viruses, the key survival strategy lies in their ability to evolve rapidly through mutations that affect protein structure and function. Alongside experimental research on antiviral development, computational data mining plays an important role in deciphering the molecular and genomic signatures of viral adaptability. Uncovering conserved regions in viral proteins with diverse chemical and biological properties is an important area of research for developing antiviral therapeutics, though identifying those regions is not trivial. Advances in protein structure databases and repositories, driven by experimental research, have accelerated in-silico mining of the data to generate more integrative information. Despite the considerable effort spent on correlating protein structural information with sequence, overcoming the high mutability and adaptability of the viral genomic structure remains a challenge. In the current study, the authors have developed a user-friendly web application that allows users to study and visualize protein segment variability in viral proteins and may help to find antiviral strategies. The web application allows thorough mining of the surface properties and variability of viral proteins, which, in combination with immunogenicity and evolutionary properties, makes the visualization robust. In addition to the previously published 20-dimensional Euclidean geometry based sequence variability characterization algorithm, four other parameters have been considered for this platform: [1] predicted solvent accessibility, [2] B-cell epitope potential, [3] T-cell epitope potential and [4] coevolving regions of the viral protein.
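The 20-dimensional representation mentioned above can be illustrated with a short sketch. It is not the authors' code. It assumes one common reading of the cited graphical-representation approach: each of the 20 standard amino acids owns one axis, the sequence is traced as a unit-step walk, and a single descriptor (called `pr_value` here) is the distance from the origin to the mean of all visited points. The axis order, the skipping of non-standard letters, and the function name are all assumptions made for illustration.

```python
import math

# Assumed axis order: one axis per standard amino acid (alphabetical one-letter codes).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
AXIS = {aa: i for i, aa in enumerate(AMINO_ACIDS)}


def pr_value(sequence: str) -> float:
    """Distance from the origin to the mean point of a 20D unit-step walk.

    This is an assumed definition used for illustration only.
    """
    position = [0] * 20   # current point of the walk
    coord_sum = [0] * 20  # running sum of every visited point
    steps = 0
    for residue in sequence.upper():
        axis = AXIS.get(residue)
        if axis is None:
            continue  # skip gaps and non-standard letters (an assumption)
        position[axis] += 1  # one unit step along that residue's axis
        for i in range(20):
            coord_sum[i] += position[i]
        steps += 1
    if steps == 0:
        raise ValueError("sequence contains no standard residues")
    return math.sqrt(sum((s / steps) ** 2 for s in coord_sum))


if __name__ == "__main__":
    # Same composition, different order: the descriptor differs.
    print(pr_value("AAC"), pr_value("ACA"))
```

Because the walk accumulates positions, two stretches with the same composition but a different residue order generally receive different values, which is what makes a descriptor of this kind usable for comparing sequence stretches.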
The uniqueness of this study lies in the fact that a protein sequence stretch, rather than a single residue, is characterized, which makes it possible to compare the properties of protein segments with their variability. As an example, besides presenting the web application platform, the current work presents five SARS-CoV-2 proteins, with a focus on the S protein. The current web application database contains 29 proteins from 7 viruses, and the raw data used in this study are available in a GitHub repository. The web application is available at the following address: http://www.protsegvar.com .
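The idea of comparing segment-level properties with segment variability can be sketched as follows. This is a minimal illustration, not the web application's implementation: the window size, the thresholds, the toy numbers, and both function names are hypothetical, and the per-window variability values are assumed to have been computed elsewhere.

```python
from statistics import mean


def segment_means(per_residue_scores, window):
    """Average a per-residue score over every overlapping window."""
    if window < 1 or window > len(per_residue_scores):
        raise ValueError("window must be between 1 and the sequence length")
    return [
        mean(per_residue_scores[start:start + window])
        for start in range(len(per_residue_scores) - window + 1)
    ]


def conserved_exposed_segments(variability, accessibility, window, max_var, min_acc):
    """Start indices of windows that are both low-variability and solvent-exposed.

    `variability` holds one value per window; `accessibility` holds one value
    per residue and is averaged here to the same window resolution.
    """
    acc_means = segment_means(accessibility, window)
    if len(variability) != len(acc_means):
        raise ValueError("need exactly one variability value per window")
    return [
        i
        for i, (var, acc) in enumerate(zip(variability, acc_means))
        if var <= max_var and acc >= min_acc
    ]


if __name__ == "__main__":
    # Toy data (hypothetical): 6 residues, window of 3, hence 4 windows.
    accessibility = [0.1, 0.9, 0.8, 0.7, 0.2, 0.1]
    variability = [3, 1, 1, 4]
    print(conserved_exposed_segments(variability, accessibility, 3, 1, 0.5))
```

The same averaging step could be applied to any other per-residue track, such as a predicted epitope score, so that every property is expressed at the resolution of the variability measure before the two are compared.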
Article activity feed
SciScore for 10.1101/2021.12.01.470810:
Please note, not all rigor criteria are appropriate for all manuscripts.
Table 1: Rigor
NIH rigor criteria are not applicable to paper type.

Table 2: Resources

Recombinant DNA
- Sentence: This web application was designed to perform two types of job: [a] A simple pR-value calculator, where user can submit a single protein sequence to obtain pR-value (Ghosh et al., 2010) (Figure 1).
  Resource: pR-value (suggested: None)

Software and Algorithms
- Sentence: For the processing and preparation of FASTA file SeqKit FASTA/Q file manipulation tool were used (Shen et al., 2016) (https://github.com/shenwei356/seqkit) and multiple alignment were performed in MUSCLE (Edgar, 2004) (http://www.drive5.com/muscle/). pyDCA package were used for the purpose of coevolution analysis (Zerihun et al., 2020) (https://pypi.org/project/pydca/). 2.2. Calculation of 20D algorithm-based variability of proteins: Variability calculation of the viral protein sequence stretches were performed using 20-dimensional Cartesian coordinate based graphical representation algorithm (Nandy et al., 2009, Ghosh et al., 2010).
  Resource: MUSCLE (suggested: MUSCLE, RRID:SCR_011812)
- Sentence: Development of web application: Web application was developed in Flask microframework, with a front-end user interface (UI) designed in JavaScript and backend in Python.
  Resource: Python (suggested: IPython, RRID:SCR_001658)

Results from OddPub: Thank you for sharing your code and data.
Results from LimitationRecognizer: An explicit section about the limitations of the techniques employed in this study was not found. We encourage authors to address study limitations.
Results from TrialIdentifier: No clinical trial numbers were referenced.
Results from Barzooka: We did not find any issues relating to the usage of bar graphs.
Results from JetFighter: We did not find any issues relating to colormaps.
Results from rtransparent:
- Thank you for including a conflict of interest statement. Authors are encouraged to include this statement when submitting to a journal.
- Thank you for including a funding statement. Authors are encouraged to include this statement when submitting to a journal.
- No protocol registration statement was detected.
Results from scite Reference Check: We found no unreliable references.