Characterization and structural prediction of the putative ORF10 protein in SARS-CoV-2


Abstract

Upstream of the 3’-untranslated region in the SARS-CoV-2 genome is ORF10, which has been proposed to encode the ORF10 protein. Whether this protein is actually synthesized remains unclear, so further investigation is warranted. This study uses multiple bioinformatic tools to characterize the ORF10 protein biochemically and functionally and to predict its tertiary structure. Results indicate a highly ordered, hydrophobic, and thermally stable protein that contains at least one transmembrane region. Its residues also show high protein-binding propensity, primarily in the N-terminal half. An assessment of forty-one missense mutations reveals slight changes in residue flexibility, mainly in the C-terminal half. However, these same mutations do not cause significant changes in protein stability or other biochemical features. The predicted model suggests that the ORF10 protein contains a β-α-β motif, with a β-molecular recognition feature occurring in the first β-strand. Functionally, the ORF10 protein could be a membrane protein. A single pocket was identified in this protein but was found to have low druggability. ORF10 itself falls into two distinct lineages: the SARS-CoV lineage and the SARS-CoV-2 lineage. Evidence of strong positive selection (dN/dS = 4.01) was found within the SARS-CoV-2 lineage, and evidence of purifying selection (dN/dS = 0.713) within the SARS-CoV lineage. Collectively, these results help assess the biological relevance of ORF10 and its putatively encoded protein, and may thereby aid diagnostic and possibly vaccine development.
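
The two dN/dS values quoted in the abstract can be read with the usual convention that a ratio above 1 suggests positive selection and a ratio below 1 suggests purifying selection. The Python sketch below only illustrates that convention; the function name `classify_selection` is hypothetical, and the sketch does not reproduce the authors' evolutionary analysis or any statistical test of significance.

```python
def classify_selection(omega: float) -> str:
    """Interpret a dN/dS ratio (omega) using the conventional threshold of 1.

    This is an illustrative rule of thumb only: it says nothing about
    whether the estimate is statistically distinguishable from 1.
    """
    if omega < 0:
        raise ValueError("dN/dS cannot be negative")
    if omega > 1:
        return "positive"
    if omega < 1:
        return "purifying"
    return "neutral"


# Values as reported in the abstract.
for lineage, omega in [("SARS-CoV-2", 4.01), ("SARS-CoV", 0.713)]:
    print(f"{lineage}: dN/dS = {omega} -> {classify_selection(omega)} selection")
```

A point estimate above 1 is suggestive rather than conclusive, which matches the authors' own caution about the small number of sequences used.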

Article activity feed

  1. SciScore for 10.1101/2020.10.26.355784:

    Please note, not all rigor criteria are appropriate for all manuscripts.

    Table 1: Rigor

    NIH rigor criteria are not applicable to paper type.

    Table 2: Resources

    Software and Algorithms
    Sentences / Resources
    2.1 Sequence, Phylogenetic, and Evolutionary Analysis: The nucleotide (NC_045512.2) and protein (YP_009725255.1) reference sequences for ORF10 in SARS-CoV-2 were acquired from the NCBI RefSeq Database.
    RefSeq
    suggested: (RefSeq, RRID:SCR_003496)
    BLASTn was used to collect sequences from both distantly and closely related CoVs (Table S1).
    BLASTn
    suggested: (BLASTN, RRID:SCR_001598)
    Sequence alignments were performed using MUSCLE on the MEGA-X v10.1.7 software [19,20]. Alignment reliability and total similarity were determined by overall mean distance and calculated using the p-distance substitution model.
    MUSCLE
    suggested: (MUSCLE, RRID:SCR_011812)
    MEGA-X
    suggested: None
    The grand average of hydropathicity (GRAVY), aliphatic index, and instability index were determined using ProtParam.
    ProtParam
    suggested: (ProtParam Tool, RRID:SCR_018087)
    Secondary structural elements were predicted using PSIPRED v4.0 [29]. TM regions were predicted using TMPred [30]. 2.3 Protein Modeling and Evaluation: The IntFOLD webserver was used to construct a model of the ORF10 protein with an ab initio modeling approach.
    PSIPRED
    suggested: (PSIPRED, RRID:SCR_010246)
    This final model was evaluated by PROCHECK and ERRAT on the SAVES v6.0 webtool (https://saves.mbi.ucla.edu/) [34-36]. This theoretical structure was deposited in ModelArchive (https://modelarchive.org/) and given the ID ma-9yzbf.
    SAVES
    suggested: (SAVES, RRID:SCR_018219)
    The ProFunc server predicted the function of the ORF10 protein based on the newly produced model.
    ProFunc
    suggested: (ProFunc, RRID:SCR_004450)
    Electrostatic surfaces were generated based on the AMBER ff14SB charge model.
    AMBER
    suggested: (AMBER, RRID:SCR_016151)
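
Three of the quantities named in the sentences above (p-distance, GRAVY, and the aliphatic index) have simple published definitions, sketched below in Python. This is a minimal illustration and not the MEGA-X or ProtParam implementation: the function names are hypothetical, the toy sequences in the usage lines are made up, and skipping gapped positions in `p_distance` is an assumption (comparable to pairwise deletion). The hydropathy values are the standard Kyte-Doolittle scale, and the aliphatic index uses the usual coefficients 2.9 and 3.9 on mole percentages.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(seq: str) -> float:
    """Grand average of hydropathicity: mean Kyte-Doolittle value per residue."""
    if not seq:
        raise ValueError("empty sequence")
    return sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)


def aliphatic_index(seq: str) -> float:
    """Aliphatic index: X(Ala) + 2.9*X(Val) + 3.9*(X(Ile) + X(Leu)),
    where X is the mole percent of each residue."""
    if not seq:
        raise ValueError("empty sequence")

    def mole_percent(aa: str) -> float:
        return 100.0 * seq.count(aa) / len(seq)

    return (mole_percent("A")
            + 2.9 * mole_percent("V")
            + 3.9 * (mole_percent("I") + mole_percent("L")))


def p_distance(a: str, b: str) -> float:
    """Proportion of differing sites between two aligned sequences.

    Assumption: positions with a gap ('-') in either sequence are skipped.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(x != y for x, y in pairs) / len(pairs)


# Toy inputs, not taken from the study.
print(gravy("AVIL"), aliphatic_index("AVIL"), p_distance("ACGT", "ACGA"))
```

A positive GRAVY value indicates an overall hydrophobic sequence, and a high aliphatic index is commonly read as an indicator of thermal stability, which is how the abstract's "hydrophobic" and "thermally stable" descriptions relate to these scores.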

    Results from OddPub: We did not detect open data. We also did not detect open code. Researchers are encouraged to share open data when possible (see Nature blog).


    Results from LimitationRecognizer: We detected the following sentences addressing limitations in the study:
    A major limitation to this study is the quality of sequences due to the varying methods used to acquire them. Although unlikely, there is always the possibility of errors and missing information that may affect the results obtained herein. In addition, with only six sequences used, the evidence of positive selection should not be regarded as conclusive but as additional support for the hypothesis that positive selection is acting on the SARS-CoV-2 lineage for ORF10. Additional sequences of good quality from other closely related CoVs can help to resolve this matter. It is also important to note that mutations analyzed in this study are likely present in other geographic locations; therefore, the list of locations in this study should not be viewed as exhaustive. Overexpression of ORF10 has been shown to occur in severe cases of COVID-19, whereas in milder cases its expression seems to be minimal [16]. Therefore, knowing the structure and biochemical characteristics associated with the ORF10 protein might aid in developing diagnostic tests that detect the ORF10 protein, thereby helping to determine the likely progression the disease may take in patients [16]. In addition, based on results by DoGSiteScorer, the ORF10 protein would not make for a suitable drug target. Despite not serving as a likely drug target, the predicted structure for the ORF10 protein may be useful in further computational analyses detailing events in viral pathogenesis or virulence. Hopefully, this study can ...

    Results from TrialIdentifier: No clinical trial numbers were referenced.


    Results from Barzooka: We found bar graphs of continuous data. We recommend replacing bar graphs with more informative graphics, as many different datasets can lead to the same bar graph. The actual data may suggest different conclusions from the summary statistics. For more information, please see Weissgerber et al (2015).


    Results from JetFighter: Please consider improving the rainbow (“jet”) colormap(s) used on page 22. At least one figure is not accessible to readers with colorblindness and/or is not true to the data, i.e. not perceptually uniform.


    Results from rtransparent:
    • Thank you for including a conflict of interest statement. Authors are encouraged to include this statement when submitting to a journal.
    • No funding statement was detected.
    • No protocol registration statement was detected.

    About SciScore

    SciScore is an automated tool that is designed to assist expert reviewers by finding and presenting formulaic information scattered throughout a paper in a standard, easy to digest format. SciScore checks for the presence and correctness of RRIDs (research resource identifiers), and for rigor criteria such as sex and investigator blinding. For details on the theoretical underpinning of rigor criteria and the tools shown here, including references cited, please follow this link.