A putative new SARS-CoV protein, 3a*, encoded in an ORF overlapping ORF3a

Abstract

Identification of the full complement of genes in SARS-CoV-2 is a crucial step towards gaining a fuller understanding of its molecular biology. However, short and/or overlapping genes can be difficult to detect using conventional computational approaches, whereas high throughput experimental approaches – such as ribosome profiling – cannot distinguish translation of functional peptides from regulatory translation or translational noise. By studying regions showing enhanced conservation at synonymous sites in alignments of SARS-CoV and related viruses (subgenus Sarbecovirus), and correlating with the conserved presence of an open reading frame and plausible translation mechanism, we identified a putative new gene, ORF3a*, overlapping ORF3a in an alternative reading frame. A recently published ribosome profiling study confirmed that ORF3a* is indeed translated during infection. ORF3a* is conserved across the subgenus Sarbecovirus, and encodes a 40–41 amino acid predicted transmembrane protein.
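One ingredient of the approach described in the abstract, checking for an open reading frame in an alternative frame inside a known gene, can be illustrated with a minimal sketch. This is not the authors' pipeline, which relies on synonymous-site conservation across an alignment of sarbecovirus sequences; it only scans a single nucleotide sequence. The function names, the `min_codons` threshold, and the ATG-only start rule are assumptions made for illustration.

```python
from typing import List, Tuple

STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq: str, frame: int, min_codons: int = 30) -> List[Tuple[int, int]]:
    """Return (start, end) pairs of ATG-initiated ORFs in one forward frame.

    Coordinates are 0-based and half-open; `end` includes the stop codon.
    `frame` is 0, 1 or 2. `min_codons` counts codons before the stop
    (hypothetical threshold, not taken from the article).
    """
    seq = seq.upper()
    orfs = []
    start = None
    for i in range(frame, len(seq) - 2, 3):
        codon = seq[i:i + 3]
        if start is None and codon == "ATG":
            start = i  # first in-frame ATG opens a candidate ORF
        elif start is not None and codon in STOP_CODONS:
            if (i - start) // 3 >= min_codons:
                orfs.append((start, i + 3))
            start = None  # resume the search after this stop codon
    return orfs


def overlapping_orfs(
    seq: str, main_start: int, main_end: int, min_codons: int = 30
) -> List[Tuple[int, int, int]]:
    """List (frame, start, end) for ORFs lying fully inside a main ORF
    but read in one of the two alternative frames."""
    main_frame = main_start % 3
    hits = []
    for frame in range(3):
        if frame == main_frame:
            continue  # skip the frame of the main ORF itself
        for start, end in find_orfs(seq, frame, min_codons):
            if start >= main_start and end <= main_end:
                hits.append((frame, start, end))
    return hits
```

A hit from such a scan is only a candidate. As the abstract notes, the case for ORF3a* rests on the ORF being conserved across the subgenus, on enhanced synonymous-site conservation in the overlap, and on a plausible translation mechanism, none of which this sketch evaluates.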

Article activity feed

  1. SciScore for 10.1101/2020.05.12.088088:

    Please note, not all rigor criteria are appropriate for all manuscripts.

    Table 1: Rigor

    Institutional Review Board Statement: not detected.
    Randomization: not detected.
    Blinding: not detected.
    Power Analysis: not detected.
    Sex as a biological variable: not detected.

    Table 2: Resources

    Software and Algorithms

    Sentence: Conservation statistics were then mapped back to NC_045512.2 coordinates for plotting (Fig. 1).
    Resource: Conservation
    Suggested: (Conservation, RRID:SCR_016064)

    Sentence: For the ORF S and 3a analyses (Fig. 3), the ORF S and 3a regions were extracted from all 54 sarbecovirus sequences, translated to amino acids, aligned using MUSCLE (Edgar, 2004), and the amino acid alignments were used to guide codon-respecting nucleotide sequence alignments (EMBOSS tranalign; Rice et al., 2000).
    Resource: MUSCLE
    Suggested: (MUSCLE, RRID:SCR_011812)

    Sentence: Molecular mass and isoelectric point were calculated with EMBOSS pepstats (Rice et al., 2000).
    Resource: EMBOSS
    Suggested: (EMBOSS, RRID:SCR_008493)

    Results from OddPub: We did not detect open data. We also did not detect open code. Researchers are encouraged to share open data when possible (see Nature blog).


    Results from LimitationRecognizer: An explicit section about the limitations of the techniques employed in this study was not found. We encourage authors to address study limitations.

    Results from TrialIdentifier: No clinical trial numbers were referenced.


    Results from Barzooka: We did not find any issues relating to the usage of bar graphs.


    Results from JetFighter: We did not find any issues relating to colormaps.


    Results from rtransparent:
    • Thank you for including a conflict of interest statement. Authors are encouraged to include this statement when submitting to a journal.
    • Thank you for including a funding statement. Authors are encouraged to include this statement when submitting to a journal.
    • No protocol registration statement was detected.

    About SciScore

    SciScore is an automated tool that is designed to assist expert reviewers by finding and presenting formulaic information scattered throughout a paper in a standard, easy to digest format. SciScore checks for the presence and correctness of RRIDs (research resource identifiers), and for rigor criteria such as sex and investigator blinding. For details on the theoretical underpinning of rigor criteria and the tools shown here, including references cited, please follow this link.